
































Vol. 55, No. 4


July, 1958


Psychological Bulletin 


PROBLEMS AND METHODS OF PSYCHOPHYSICS¹


S. S. STEVENS 
Harvard University 


The methods and procedures of 
psychophysics have been reviewed 
from time to time and marshaled into 
more-or-less logical array. If another 
such inventory is now in order it is 
because recent developments allow 
us to put the various methods in new 
perspective and to see more clearly 
how they articulate with the prob- 
lems of psychophysics. 

Let us admit first off that a concern 
with method is justified only if it 
leads to something beyond itself. The 
study of method, which I suppose is 
the proper meaning of that over-
worked term methodology, is one of
those “necessary evils” whose justifi-
cation lies in its potential contribution
to the solving of substantive prob- 
lems. But the problems are the main 
concern. If an empirical problem is 
worth solving, a method for it is 
worth developing, but it may turn 
out that there is little profit in fash- 
ioning tools to do what nobody wants 
done. Methodology can easily be-
come methodolatry.
Psychophysical methods have at
times been treated as though they
were ends in themselves, and in
many texts the term psychophysics
has seemed to be synonymous with
three “classical” procedures for solv-
ing an issue that few people care
about. Little wonder then that psy-
chophysics has sometimes been ac-
cused of inconsequence. Attitudes of
this sort are not improved by the de-
cision of a distinguished committee
to define psychophysics as the use of
a human observer as a “null instru-
ment” to determine “equality or dif-
ference of sensations” (20, p. 59).
According to the view of this com-
mittee, psychophysics is a strange
land lying between “the physical and
psychical” (20, p. 65). As Evans (9,
p. 5) puts it, “Psychophysics at pres-
ent, therefore, is limited to the rela-
tive evaluation of light beams with re-
spect to normal observers under
standardized conditions. ...”

1 Supported by a grant from the National
Science Foundation and by Contract Nonr-
1866(15) with the Office of Naval Research
(Project Nr142-201, Report PNR-190). Re-
production for any purpose of the U. S. Gov-
ernment is permitted.

Psychophysics is really a much
more nutritious subject than these
conceptions imply. Seeking the laws
that relate the responses of men and
animals to the energetic configura-
tions of the environment, it probes
matters of deep human interest, and
matters that often make a practical
difference in the market place. For
some of us, at any rate, a certain ex-
citement attaches to the discovery
that on “quantitative” or prothetic
perceptual continua, such as bright-
ness, loudness, heaviness, length,
duration, etc., equal stimulus ratios
produce equal sensation ratios (31).
This principle means that the psycho-
logical magnitude is a power function
of the physical magnitude. And an 
example of practical utility can be 
seen in the application of this power 
law to the problem of predicting the 
loudness of a complex noise from 
physical measurements made on the 
spectrum of the sound (29, 33). 
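
The power-function principle can be checked numerically. The sketch below is only an illustration: the exponent and the scale constant are hypothetical values, not figures taken from this paper, and the names `power_law` and `sensation_ratio` are invented for the example. It shows that, under a power function, a fixed stimulus ratio yields the same sensation ratio wherever on the continuum it is applied.

```python
def power_law(stimulus, exponent, k=1.0):
    """Psychological magnitude as a power function of physical magnitude."""
    if stimulus <= 0:
        raise ValueError("stimulus magnitude must be positive")
    return k * stimulus ** exponent


# Hypothetical exponent, chosen only for illustration.
EXPONENT = 0.3


def sensation_ratio(s1, s2, exponent=EXPONENT):
    """Ratio of the sensations produced by stimuli s1 and s2."""
    return power_law(s2, exponent) / power_law(s1, exponent)


# Doubling the stimulus multiplies the sensation by the same factor,
# whether the doubling starts at 1, 10, or 100 stimulus units.
ratios = [sensation_ratio(s, 2 * s) for s in (1.0, 10.0, 100.0)]
```

Each entry of `ratios` equals 2 raised to the assumed exponent, which is the sense in which equal stimulus ratios produce equal sensation ratios.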
Psychophysics has its problems 
and its methods. The purpose of 
this paper is to try to classify the 
methods in terms of the problems. It 
is a follow-up on some earlier at- 
tempts (25, 26), in which a similar 
point of view was tried out in part, 
but in which the coverage was less 
systematic. A related effort, but 
rather different in outcome, was 
made by Guilford (14). In attempt- 
ing an exercise of this sort we must 
realize that an element of arbitrari- 
ness attaches to taxonomies, and al- 
ternative schemes are always possi- 
ble. Furthermore, we must forego 
the ambition to be exhaustive and 
completely consistent, for the meth-
ods of psychophysics are in a state of
flux, and, as knowledge continues to 
expand, our understanding of pro- 
cedures will change and improve. 
The names that have become at- 
tached to the various methods show 
interesting vagaries, for in labeling
our methods we fasten attention
sometimes on one feature and some- 
times on another. The name refers 
sometimes to the manner of present- 
ing stimuli, sometimes to the task as- 
signed the observer, and sometimes 
to the statistical treatment used to 
process the data. 

Having names for methods pro- 
vides a convenient shorthand for the 
description of experiments—as well 
as some handy items to ask about on 
examinations. But labeling is not 
without its drawbacks. Labeling pro- 
duces jargon, and jargon leads to 
esoteric discourse. Many readers 
would find clarity improved if special 
names for procedures were banned
and authors were forced to frame
their descriptions in conventional 
English. 

But names for the methods are 
probably here to stay, and our pur- 
pose will be to classify rather than to 
abolish. Actually, my greater inter- 
est is in the development of a schema 
by which the methods may be classi- 
fied, rather than in any particular in- 
ventory of procedures, but the 
schema proposed can be illustrated 
by tables of methods. And since the 
methods exist to solve the problems 
of psychophysics, it is appropriate to 
comment on what certain methods 
may and may not achieve in the way 
of solutions. 


THE PSYCHOPHYSICAL PARAMETERS 

Psychophysics concerns the func-
tional relation between stimulus and
response: R = f(S). This function is
affected by numerous parameters.
For the purpose of classifying meth-
ods it is convenient to distinguish
three classes of these parameters,
namely, the task undertaken by the
observer, the manner in which the
stimuli are presented, and the statis-
tical measure employed in the de-
scription of the data. These classes
and their principal subdivisions are
listed in Table 1.

In Table 1 the various subdivisions
of the three psychophysical param-
eters (task, stimuli, and statistic)
are designated by a capital letter.
These letters will be used in Table 2
to characterize the several psycho-
physical methods. But first let us ex-
amine the psychophysical parameters
in a little more detail.

TABLE 1

PSYCHOPHYSICAL PARAMETERS

Task of observer     Stimulus         Statistical measure
is to judge          arrangement

C  Classification    F  Fixed         L  Measure of location (central tendency)
O  Order             A  Adjustable    V  Measure of variability or confusion
I  Intervals
R  Ratios
M  Magnitudes

Task

The observer's task is normally set
by means of instructions. The ob-
server is “tuned” to react in one way
rather than another, but from our
present point of view only certain as-
pects of this tuning are of conse-
quence. It is important, for example,
whether the observer is told to judge
brightness, or hue, or saturation, but
we are not here concerned with this
aspect of the Aufgabe. Rather we
take the attentional focus for granted
and then ask what type of relational
judgments the observer is trying to
make. These relational judgments
fall into five groups, as follows.

Classification (C). Here the ob-
server's task is classification of one
sort or another. He judges whether
his perception meets some nominal
criterion, with no reference to order
among his perceptions. In the sim-
plest case the observer, attending to
some attribute or aspect of percep-
tion, judges whether it is present or
absent. Thus he may press a key if
he hears a tone, or if he hears a
change in the tone. Or he may be re-
quired to say in what quadrant of a
circle a light appeared, or in what
interval of time a click was sounded.
The task is that of detection, and the
observer is set to behave as a yes-no
device. In other cases the observer's
task may be to judge equivalence, i.e.,
whether or not some criterion is met.
The criterion may be set by a first
stimulus, and the task may be to
judge whether a second stimulus pro-
duces an equivalent effect, e.g., the
loudness of one tone may be adjusted
to match that of another tone. Or the
criterion may be established by in-
struction, as when the subject is told
to adjust a line to make it appear
vertical, or to adjust wavelength to
make a light appear pure green.
Sometimes the classification problem
is simply identification or recognition:
is the present stimulus the same as
some previous stimulus to which a
name or number may have been as-
signed? Or, for example, was the
sound presented an English word,
and which word was it?

Order (O). The observer is set to
judge greater or less, heavier or
lighter, louder or softer, etc.

Intervals or distances (I). The ob-
server's task is to judge apparent
distance or difference between two or
more perceptions. Ordinarily this
takes the form of partitioning a con-
tinuum into apparently equal inter-
vals or assigning stimuli to categories
that seem equally spaced along a con-
tinuum.

Ratios (R). The observer attends
to the relative magnitudes of two or
more perceptions and reports the ap-
parent ratios among them. Alterna-
tively he may be set to produce stim-
uli that appear to stand in a pre-
scribed ratio, which may be stated
numerically or may be set in terms
of some other pair of stimuli.

Magnitudes (M). The observer
judges the apparent magnitude of a
perception. He usually attempts to
assign numbers proportional to the
apparent magnitudes of a series of
stimuli. Or he may be set to produce
stimuli that correspond to a series of
prescribed magnitudes.

Stimulus Arrangement 

Since there is no end to the variety 
of procedures that may be used for 
the presentation of stimuli, it might 
appear that no useful criterion for 
classifying them is possible. On the 
other hand, there is an important 
procedural distinction between con- 
fronting the observer with a variable 
stimulus and confronting him with a 
fixed stimulus. Variable or adjust- 
able stimuli are better for some pur-
poses, fixed stimuli for others. As a
practical matter, fixed stimuli can
nearly always be used, but adjustable
stimuli, unfortunately, are not always
easy to come by. It is hard, for ex-
ample, to devise continuously adjust-
able weights for a lifted weight ex- 
periment, or continuously adjustable 
concentrations for experiments on 
taste. 

We will distinguish then between 
two stimulus arrangements: 

Fixed stimuli (F). Fixed stimuli 
are those that are not varied during 
the time they are being observed. Us- 
ually, of course, they are varied be- 
tween observations. 

Adjustable stimuli (A). Adjustable 
stimuli are those that may be altered 
during the course of observation. Us- 
ually the observer does the adjusting 
by operating a control, but the ex- 
perimenter may operate the controls, 
or the adjustments may be made 
automatically, as in the method we 
shall call “tracking.”


Statistical Measures 


How the data from a psychophysi- 
cal experiment are processed usually 
depends on the experimenter’s pur- 
pose. Neglecting the secondary frills 
of statistical descriptions we can di- 
vide the usual treatments into two
classes, depending on whether the
final measure used is one or another 
measure of location (or central ten- 
dency, so-called) or one or another 
measure of variability, confusion, or 
dispersion. 

Measures of location (L). For most 
purposes the measure we want to use 
is either a mean or a median. We 
want to know the typical response of 
a subject or of a group of subjects. 
Given a measure of location, a meas- 
ure of dispersion can then be used to 
gauge the precision of the judgments. 

The choice of a proper measure of 
location often presents interesting 
problems whose solutions are far 
from obvious. For one thing, we have 
a choice among such conventional 
measures as the mode, median, arith- 
metic mean, geometric mean, and 
harmonic mean. But it may turn out 
that none of these is appropriate. 
The arithmetic mean is inappropri- 
ate, for example, when applied to the 
readings obtained with a particular 
instrument whose indications are a 
nonlinear function of the quantity we 
want to average. Elsewhere (28) the 
writer has tried to suggest how an 
iterative procedure might aid in the 
solution of some of these problems. 
For a variety of reasons, a defensible 
rule concerning measures of location 
in psychophysical experiments is 
simply: when in doubt use the me- 
dian. The median has the advantage 
that it is invariant under nonlinear 
transformations so long as they are 
increasing and monotonic. 
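
The invariance of the median can be demonstrated with a few numbers. The sketch below is an illustration only: the readings are hypothetical, the logarithm stands in for any increasing monotonic transformation, and the sample is deliberately odd-sized so that the median is an actual observation (with an even-sized sample the conventional averaging of the two middle values makes the invariance only approximate).

```python
import math
import statistics

# Hypothetical instrument readings; an odd count keeps the median an
# actual observation rather than an average of two middle values.
readings = [1.0, 2.0, 4.0, 8.0, 64.0]


def transform(x):
    """An increasing, monotonic, nonlinear transformation."""
    return math.log10(x)


# Median: transforming before or after averaging gives the same value.
median_then_transform = transform(statistics.median(readings))
transform_then_median = statistics.median([transform(x) for x in readings])

# Arithmetic mean: the order of operations changes the answer.
mean_then_transform = transform(statistics.mean(readings))
transform_then_mean = statistics.mean([transform(x) for x in readings])
```

The two median values coincide, whereas the two mean values differ, which is the practical force of the rule "when in doubt use the median."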

Measures of variability (V). For 
our present purpose the measures of 
variability include the conventional 
measures of dispersion (standard de- 
viation, average deviation, inter- 
quartile range, etc.), as well as meas- 
ures of confusion, such as the propor- 
tions of times a given stimulus is 
judged in different categories. Meas- 
ures of variability are often used in
the assessment of differential sensi-
tivity or resolving power. They are 
also used, under various assumptions, 
as “distance” measures on psycho-
logical continua. This “unitizing” of
dispersion may be legitimate on some
types of continua, but on quantita-
tive and intensitive continua the as-
sumptions commonly made about
discriminal dispersion are frequently
in error (31). We will return to this
problem later.

PROBLEMS AND METHODS 


Some of the principal problems of 
psychophysics are listed in Table 2 
along with some of the typical meth- 
ods used in their solution. As is clear 
in Table 2, each of the problems of 
psychophysics may be regarded as 
one or another problem of scale con- 
struction construed in the widest 
sense of the term. This is scarcely 
surprising, for psychophysics, like 
most other parts of science, is mainly 
concerned with measurement. And 
since measurement is possible at dif- 
ferent levels, ranging from nominal 
to ratio, the basic problems of psy- 
chophysics can be classified in a way 
that reflects these different levels. 

In listing the problems in this way 
we must remind ourselves that they 
represent types or classes of issues 
that might concern the psychophysi- 
cist. Often the solution of one or an- 
other of these problems is not an end 
in itself, but is only a necessary step 
in the answering of a more far-reach- 
ing query. Especially in the more 
practical pursuits, such as human en- 
gineering, does it often turn out that 
the solving of problems like those in 
Table 2 is merely a means to an end. 
It has already been mentioned, for 
example, how the development of the 
sone scale of loudness has helped to 
answer a persistent problem facing 
the acoustical engineers (29, 33). 
Even more extensive commercial ap-
plications are made of the Munsell
color scales which are based on ex- 
tensive psychophysical studies. Ex- 
amples of this sort could be multi- 
plied at length, but our interest here 
is more in fundamentals than in ap- 
plications. 

The methods listed under each 
problem do not exhaust the possi- 
bilities, nor do they all qualify neces- 
sarily as good procedures. On the 
other hand, these methods illustrate 
the procedures commonly used. The 
names given to the methods are 
mostly those in general use, although 
an occasional name is new, as is also 
an occasional method. In construct- 
ing tables of this sort one can scarcely 
avoid thinking of new procedures 
that ought to be put to test. 

Let us now consider each problem 
in turn. 


I. NOMINAL SCALES 


The nominal scale is the most gen- 
eral type of scale. It is the primitive 
variety that involves only classifica- 
tion, with no ordering or metricizing. 
Perhaps under some definitions this 
simple process may qualify, not as 
measurement, but as a kind of half- 
way house on the road to it. On the 
other hand, there is no doubt that the 
psychological processes involved in 
the forming of classes, concepts, or 
categories present rich and varied 
problems. As a matter of fact, 
Bruner, Goodnow, and Austin (7) 
have written a whole book on the 
problem of “categorizing and con-
ceptualizing,” which is essentially
a problem in nominal scaling. So too 
are the manifold problems in such 
areas as pattern recognition and ar- 
ticulation testing. 

TABLE 2

PSYCHOPHYSICAL PROBLEMS AND METHODS

  I. To determine nominal scales
     a. Absolute thresholds
        1. Single stimuli
        2. Counting
        3. Forced location (forced choice)
        4. Adjustment
        5. Limits
        6. Tracking
        7. Staircase (up-and-down)
     b. Resolving power or differential sensitivity
        1. Adjustment (average error)
        2. Tracking                                           CAV-OAV-CAL
        3. Constant stimuli                                   OFV
        4. Single stimuli                                     IFV-OFV
        5. ABX                                                CFV
        6. Forced location                                    CFL
        7. Quantal increments                                 CFL
     c. Equation of magnitudes
        1. Adjustment                                         CAL
        2. Constant stimuli                                   CFL-OFL
        3. Tracking                                           CAL-OAL
        4. Staircase (up-and-down)                            CFL-OFL
     d. Identification
        1. Single stimuli                                     CFV
 II. To determine ordinal scales
        1. Pair comparison                                    OFL
        2. Rank order (order of merit)                        OFL
        3. Rating scale                                       OFL-IFL
        4. Single stimuli                                     CFV
III. To determine interval scales
        1. Equisection (bisection)                            IAL-IFL
        2. Interval estimation                                IFL
        3. Category rating (equal intervals)                  IFL
        4. Category production                                IAL
        5. Pair comparison                                    OFV
        6. Rank order                                         OFV
        7. Successive categories                              IFV
        8. Successive intervals                               IFV
 IV. To determine logarithmic interval scales
        1. Pair comparison                                    OFV
        2. Ratio matching                                     RAL-RFL
  V. To determine ratio scales
        1. Ratio estimation                                   RFL
        2. Ratio production (fractionation, multiplication)   RAL-RFL
        3. Magnitude estimation                               MFL
        4. Magnitude production                               MAL

Note: The capital letters after each method refer to the psychophysical
parameters in Table 1. Alternative procedures under a given method are
indicated by multiple sets of letters.

In what follows, we will limit our
interest to the three instances of nom-
inal scaling that have long been cen-
tral problems in psychophysics and
to a fourth problem whose develop-
ment stems from information theory.
Although conventional treatments
do not usually subsume these prob-
lems under the heading of nominal
scales, my colleague, Ulric Neisser,
has pointed out to me that that is
where they belong. Each of them re-
duces in one way or another to the
classification of stimuli. Thus, in
measuring absolute thresholds we
form a twofold classification: those
stimuli that can be perceived and
those that cannot. Similarly, in 
measuring resolving power or differ- 
ential sensitivity we divide stimulus 
increments into two classes, detect- 
able and not detectable. In the 
equation of magnitudes our task is 
obviously to form the class of equi- 
valent stimuli (according to a par-
ticular criterion) and to set it off from
the class of stimuli that are not equi- 
valent. And in the fourth problem 
our interest may be to determine into 
how many distinct classes a person 
can divide a set of stimuli without 
confusing any of them. Or it may 
concern the recognizability of stimu- 
lus configurations and the parameters 
that govern such recognition. In try- 
ing to solve these problems, we con- 
front the observer with the task of 
putting stimuli into classes. The 
psychophysical procedure involved 
requires only classification and does 
not require that the observer order 
his perceptions or judge the intervals 
or ratios among them. 

It may be true, of course, that 
under a particular experimental pro-
cedure the observer may be asked to
judge “greater or less,” but if the
problem is really one of nominal scal-
ing (e.g., the measurement of resolv-
ing power) the experimenter will pro-
ceed to use the experimental results
to determine class boundaries. In
other words, the categories of the
nominal scale will be abstracted from
the observer's judgments of apparent
order.


Absolute Thresholds 


The absolute threshold, the point 
that divides the continuum of stimuli 
into those the subject can detect and 
those he cannot, tends to elude our
efforts to define its locus because it 
shifts about in time and we are forced 
to trap it by sampling and statistics. 
Mostly we use one or another version 
of the method of single stimuli. Fixed
stimuli are presented at various levels 
and the yes-no responses of the sub- 
ject are recorded. The class boundary 
which we call the threshold may be 
defined as the stimulus level detected 
half the time. 
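
As an illustration of this definition, the sketch below locates the stimulus level detected half the time by straight-line interpolation between the two tested levels that bracket 50 per cent. The interpolation rule, the function name `threshold_50`, and the data are assumptions made for the example; the paper itself does not prescribe how the 50 per cent point is to be computed.

```python
def threshold_50(levels, p_yes):
    """Interpolate the stimulus level at which the proportion of 'yes' is 0.5."""
    points = sorted(zip(levels, p_yes))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        # Find the first pair of adjacent levels that brackets 0.5.
        if p0 <= 0.5 <= p1 and p1 > p0:
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("proportions never cross 0.5")


# Hypothetical data: fixed stimulus levels and the proportion of trials
# on which the subject reported detecting each one.
levels = [0, 2, 4, 6, 8]
p_yes = [0.05, 0.20, 0.40, 0.80, 0.95]

estimate = threshold_50(levels, p_yes)
```

With these invented proportions the class boundary falls between levels 4 and 6, at about 4.5 units.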

The method herein called counting 
is a procedure sometimes used in the 
large-scale testing of inexperienced 
subjects. Several stimulus levels are 
presented in a series and the subject 
reports how many he perceived.
Other variations on this method may 
call for the presentation of two or 
more stimuli at a fixed level. 


The method sometimes called
forced choice (5, 40) is not unlike
counting except that, instead of ask-
ing the subject to say how many stim- 
uli were presented, we ask him to say 
where in space or in which of several 
intervals of time the stimulus oc- 
curred. Since many methods call for 
a forced choice (e.g., constant stimuli 
used with two-category response), a 
better name for this procedure might 
be forced location. The subject tries 
to locate the stimulus in space or
time. Blackwell has studied many of 
the parameters of this process. 

The foregoing methods all make 
use of fixed stimuli. An adjustable 
stimulus can be used under the 
method of adjustment, in which the 
observer sets the level to be “barely
detectable.” This procedure is quick
and convenient, but ordinarily it al- 
lows only a rough determination of 
the threshold. 

If the level is varied systematically 
from points below and above thresh- 
old, and the subject signals when the 
threshold is crossed, we call the pro-
cedure the method of limits.

The term tracking is suggested to 
designate a procedure made popular 
by the Békésy audiometer (2, 3). 
The observer presses a key whenever 
he hears a tone. As long as the key is 
pressed the level of the tone de-
creases, and when the key is released
the level rises. By this means the sub- 
ject may track his threshold through- 
out the frequency range, and by 
means of a recording pen the track of 
the stimulus is traced across an ap- 
propriate grid. A smoothed curve 
through the zigzag track is usually 
drawn to depict the threshold locus. 

A method of tracking was appar- 
ently developed independently by 
Oldfield who used it for the measure- 
ment of visual thresholds (18). The 
method of tracking has also been used 
with animals. Blough (6) devised an 
experiment in which a pigeon was 
trained to peck at one key when a 
target was visible and at another key 
when the target was too dim to be 
seen. The pecks were made to control 
the brightness of the target, and the 
pigeon was thereby able to track its 
own dark-adaptation curve. 

The staircase method, sometimes 
called the “up-and-down” method 
(8), is much like the method of track- 
ing except that the stimulus level is 
not varied continuously. The levels
used are fixed and discrete, and the 
level presented on a given trial de- 
pends on the response made on the 
previous trial. Thus if the previous 
stimulus was detected, the level of 
the next stimulus is lowered a step; 
if the previous stimulus was not de- 
tected, the level is raised. 
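
The staircase rule just described can be sketched in a few lines. Everything specific below is an assumption made for illustration: the noise-free observer with a threshold of 4.5 units, the starting level, the step size, and the practice of averaging the levels at which the response reverses (a common convention, not one laid down in this paper).

```python
def run_staircase(detects, start, step, n_trials):
    """Present discrete levels; lower after a detection, raise after a miss."""
    level = start
    track = []
    for _ in range(n_trials):
        seen = detects(level)
        track.append((level, seen))
        level = level - step if seen else level + step
    return track


def reversal_levels(track):
    """Levels at which the response differed from the previous trial."""
    return [lvl for (_, prev), (lvl, cur) in zip(track, track[1:]) if cur != prev]


def observer(level):
    """Hypothetical noise-free observer with a true threshold of 4.5 units."""
    return level >= 4.5


track = run_staircase(observer, start=10, step=1, n_trials=20)
revs = reversal_levels(track)
estimate = sum(revs) / len(revs)
```

The series descends from the starting level until the first miss and thereafter oscillates between the two levels that straddle the threshold, so the average of the reversal levels falls between them.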


Resolving Power 


The measurement of the difference 
limen, ΔS, has been a major concern
of classical psychophysics, and enor- 
mous labors have gone into the refine- 
ment of methods. Yet there is no 
agreed-upon best procedure for de- 
termining the least resolvable dif- 
ference between two stimuli. Dif- 
ferential sensitivity is a difficult, 
“noisy” thing to measure, and it is 
not surprising that different pro-
cedures give different results.

Measures related to resolving power 
are sometimes obtained as a kind of 
by-product from such methods as ad-
justment and tracking. Some measure
of dispersion about the mean adjust-
ment or the “center” of the track
may be used to define the just notice- 
able difference. In principle, meas- 
ures of resolving power may be de- 
rived from the scatter of observations 
obtained from a wide variety of psy- 
chophysical procedures. 

The method of tracking has also 
been adapted to the direct measure- 
ment of just noticeable differences 
(19, 42). The trick here is to modu- 
late the level of a signal alternately 
up and down, and to let the observ- 
er's responses control the amount of 
the modulation. He presses a key 
whenever he detects modulation and 
releases it when the level of the stimu- 
lus no longer appears to rise and fall. 
When the key is pressed the degree of 
modulation declines and when the 
key is released the degree of modula- 
tion increases. Thus the observer is 
able to “track” the just noticeable
change in the signal. The average de- 
tectable modulation is determined by 
a curve drawn through the center of 
the zigzag track. This procedure has 
been used successfully with auditory 
stimuli, and there appears to be no 
reason why it should not lend itself 
equally well to some other types of 
stimuli. 

Although it is not listed in Table 2, 
the staircase adaptation of the 
method of tracking could also be used 
to determine the threshold of a mod- 
ulation. 

Note that in Table 2 the method of 
tracking is scored CAV-OAV-CAL. 
This is intended to suggest that the 
observer may be instructed to judge 
in different ways, or that different 
statistical measures may be used to 
define the differential threshold. 
Other double scorings in Table 2
stand for alternative procedures un-
der a given method. 
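
The scoring convention can be made explicit with a small decoder. The sketch below is offered only as a reading aid: the letter meanings are those of Table 1, while the function name `decode` and the wording of the expanded labels are invented for the example.

```python
# Letter codes from Table 1: task, stimulus arrangement, statistical measure.
TASK = {"C": "classification", "O": "order", "I": "intervals",
        "R": "ratios", "M": "magnitudes"}
STIMULUS = {"F": "fixed", "A": "adjustable"}
STATISTIC = {"L": "location", "V": "variability"}


def decode(scoring):
    """Expand a Table 2 scoring such as 'CAV-OAV-CAL' into readable triples."""
    decoded = []
    for code in scoring.split("-"):
        if len(code) != 3:
            raise ValueError(f"expected three letters, got {code!r}")
        task, stimulus, statistic = code
        decoded.append((TASK[task], STIMULUS[stimulus], STATISTIC[statistic]))
    return decoded


tracking = decode("CAV-OAV-CAL")
```

For tracking as a measure of resolving power, the three triples read: classification or order judged on an adjustable stimulus and summarized by a measure of variability, or classification on an adjustable stimulus summarized by a measure of location.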

The three methods, constant stim- 
uli, single stimuli and ABX, all have 
much in common. Constant stimuli 
employs a standard and a series of 
comparison stimuli, and the observer 
judges ordinarily in terms of greater 
or less. Single stimuli (sometimes 
called absolute judgment) dispenses 
with the standard and the subject 
judges in terms of two or more cate- 
gories, such as light, medium, or 
heavy. In the ABX method the sub- 
ject reports (forced choice) whether 
the third stimulus is more like the 
first or the second of a series. In one 
study this method appeared to yield 
larger difference limens than the
method of constant stimuli with a 
two-category forced-choice judgment 
(21). 

Under the method of forced location 
increments are added to a steady 
stimulus and the subject is required 
to say when in time, or where in 
space, the increment occurred (5). 
The measure of sensitivity may be 
taken as the increment correctly 
identified half the time, after correc- 
tion for chance. 
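
The correction for chance is not spelled out here. A minimal sketch, assuming the usual guessing correction in which the chance rate is one over the number of alternative locations, is given below; the formula, the function name `correct_for_chance`, and the figures are assumptions supplied for illustration, not taken from the paper.

```python
def correct_for_chance(p_observed, n_alternatives):
    """Usual guessing correction: (p - g) / (1 - g), with g = 1 / n."""
    if n_alternatives < 2:
        raise ValueError("need at least two alternative locations")
    g = 1.0 / n_alternatives
    # Observed proportions below the chance rate are treated as zero.
    return max(0.0, (p_observed - g) / (1.0 - g))


# Hypothetical four-interval task in which 62.5 per cent of the
# increments were located correctly.
p_true = correct_for_chance(0.625, 4)
```

On this assumption an observed 62.5 per cent correct among four alternatives corresponds to an increment identified half the time.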

The method of quantal increments, 
sometimes called the quantal proce- 
dure (38), attempts to measure the 
size of the stimulus increment needed 
to produce an all-or-none jump in ef- 
fective excitation. As developed by 
Stevens and Volkmann (39), the pro- 
cedure calls for a steady stimulus to 
which brief increments are added 
periodically. The observer presses a 
key whenever he perceives an incre- 
ment. Under optimal conditions we 
obtain a rectilinear psychometric 
function of predictable slope, from 
which the size of the “neural quan-
tum” can be gauged in terms of stim-
ulus units. Whenever the possibil-
ities for stimulus control are such
that this method can be used, the
writer thinks it is the preferred pro-
cedure. Psychometric functions of 
the predicted form have been ob- 
tained for pitch and loudness, and 
the data Mueller (17) obtained for 
brightness fit the linear functions 
predicted by the quantal hypothesis 
as well as, if not better than, they fit 
the sigmoid curves Mueller drew 
through them. And the slopes of 
some 45 psychometric functions all 
cluster about the predicted slope. 

On the other hand, some experi-
menters have apparently not suc-
ceeded in reducing the variability
and “noise,” whether in the experi-
mental procedure or in the observers,
to the point of obtaining clear
“quantal functions.” The task of
proving the stepwise character of dis-
crimination is not easy. It is in some
ways like trying to prove that the
charge on the electron is constant.
Prior to Millikan’s oil-drop experi-
ment, several attempts seemed to
show that the charge was not con-
stant but was probably normally dis-
tributed. For the electron the physi-
cist now knows pretty well how to set
up a repeatable experiment that will
demonstrate the all-or-none nature of
the charge, but as yet, unfortunately,
we cannot prescribe all the conditions
that will guarantee a “quantal” psy-
chometric function. One reason is
that the observer is too important a
part of the specification. The writer
can name several friends and col-
leagues who have been able to hold
the steady attention needed for this
task, but he does not know how to
specify the differences between them
and the observers who have not done
so well at it.

Nevertheless, whether or not good
clear quantal functions are obtained
in any given experiment, the pro-
cedure itself has much to recommend
it for the purpose of mapping resolv-
ing power. The method of quantal
increments goes directly at the prob-
lem of what increments can be de- 
tected, and it provides for internal 
checks on the “noisiness” of the re-
sults obtained.


Equation of Magnitudes 

The determination of equivalence 
is the problem we try to solve when 
we map such things as equal loudness 
contours, luminosity functions, con- 
tours of constant hue, etc. We try to 
determine the class of stimuli that ap- 
pear equal with respect to a particular 
attribute. It resembles what the 
economist does when he maps indif- 
ference curves for utility (35). 

For the typical matching problem 
the method of adjustment is usually 
the speediest and most straightfor- 
ward. It is not, however, without its 
constant errors (29). The method of 
constant stimuli is also widely used 
for this purpose, and it too has its 
constant errors, in particular, the so-
called “time-error” (31).

The method of tracking can also be
used to trace out an equivalence con-
tour. Zwicker and Feldtkeller (41)
presented two tones alternately. One 
was of fixed intensity and frequency 
(1000 cps), and the other was made 
to sweep slowly through the fre- 
quency range. The observer pressed 
a key whenever the variable tone 
sounded louder than the standard 
and released the key whenever the 
variable sounded fainter. While the 
key was pressed, a motor-driven at- 
tenuator decreased the level of the 
variable tone, and when the key was 
released the motor reversed its direc- 
tion and the variable grew louder. 
The data appear as a zigzag line 
traced by a pen across an audiometric 
chart, and an “average” line drawn
through the zigzag tracing shows 
directly the form of the equal loud- 
ness contour. 


The staircase or “up-and-down”
method is an adaptation of the
method of tracking for use with fixed 
values of the comparison stimulus. 
Applied to the problem of equating 
magnitudes, the staircase method is a 
kind of cross between constant stim- 
uli and tracking. It is like constant 
stimuli in that a standard and a set 
of fixed comparison stimuli are used, 
but it is like tracking in that the re- 
sponse of the subject determines the 
value of the subsequent comparison 
stimulus. For example, if the subject 
says “greater” the next comparison
stimulus is decreased; if he says
“less” the next comparison stimulus
is increased. 
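
The rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the simulated observer, its Gaussian judgment noise, the step size, and the names `simulate_staircase` and `match_estimate` are all hypothetical and are not taken from the studies cited.

```python
import random


def simulate_staircase(standard, start, step, n_trials, noise_sd, seed=0):
    """Run an up-and-down staircase against a simulated observer.

    The observer is assumed to perceive each comparison stimulus with
    Gaussian noise and to answer "greater" or "less" accordingly.
    """
    rng = random.Random(seed)
    level = start
    levels = []
    for _ in range(n_trials):
        levels.append(level)
        perceived = level + rng.gauss(0.0, noise_sd)
        if perceived > standard:
            level -= step  # "greater": the next comparison is decreased
        else:
            level += step  # "less": the next comparison is increased
    return levels


def match_estimate(levels, skip=10):
    """Average the levels visited after an initial approach phase."""
    tail = levels[skip:]
    return sum(tail) / len(tail)


levels = simulate_staircase(standard=60.0, start=70.0, step=1.0,
                            n_trials=200, noise_sd=2.0)
print(round(match_estimate(levels), 1))
```

Because each response moves the comparison toward the point of subjective equality, the sequence hovers around the match, and the average of the later levels serves as the estimate.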

The mapping of invariances by 
means of a matching procedure is the 
basis of much of what we sometimes 
call measurement in psychophysics. 
When no other scales are available 
we often “measure” the effect of one
factor in perception by changing that 
factor and then finding what altera- 
tion of another parameter will appear 
to undo the change. Thus we meas- 
ure the effect of intensity on pitch by 
changing the intensity a given
amount and then altering the fre- 
quency to restore the pitch to its 
original state. Or the observer may 
adjust the intensity instead of the fre- 
quency (24). We then measure the ef- 
fect of intensity in terms of the fre- 
quency change required to cancel the 
effect of the change in intensity. Ex- 
amples of this sort could be multi- 
plied at length. 


Identification 


In the course of its burgeoning de- 
velopment, “communication theory” 
has had numerous impacts on psy- 
chology. Despite the fact that the 
theory deals only with the nominal 
properties of ensembles, not with 
their ordinal or interval properties, 
measures of information have found a 
use in many types of inquiry. And
since the theory provides a mathe- 
matical model that concerns associa- 
tion at the nominal level of scales, it 
is not surprising that it should be put 
to work in psychophysics. 

A typical problem in this area concerns
the ability of observers to
identify or recognize a stimulus in an
absolute sense, without confusing it
with other stimuli. The problem is
related, of course, to differential sensitivity,
but the concern is more with
correct naming than it is with resolving
differences. The distinction is
something like that between pitch
discrimination on the one hand and 
so-called absolute pitch on the other. 
Absolute pitch concerns the ability 
of a listener to name or identify the 
note played to him. 

For many purposes it is important 
to know how many different stimuli 
(e.g., frequencies, intensities, colors, 
pressures, etc.) a person can identify 
with minimal error and to determine 
what factors affect information trans- 
mission on the different sensory con- 
tinua (1). It turns out that, on many 
of the common sensory continua, per- 
fect transmission of information 
(with no confusions) is ordinarily not 
possible with more than about five 
different stimuli (16). This fact is 
basic to many practical problems in 
the coding of information. 
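
The measure of information transmission referred to here can be computed from a table of stimulus-response confusions. The following sketch assumes the usual definition of transmitted information (the mutual information between stimulus and response, in bits); the function name and the two small matrices are hypothetical illustrations, not data from the cited studies.

```python
import math


def transmitted_information(confusions):
    """Return transmitted information in bits.

    confusions[i][j] is the number of trials on which stimulus i
    was given response j.
    """
    total = sum(sum(row) for row in confusions)
    row_sums = [sum(row) for row in confusions]
    col_sums = [sum(col) for col in zip(*confusions)]
    bits = 0.0
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            if count:
                p_xy = count / total
                p_x = row_sums[i] / total
                p_y = col_sums[j] / total
                bits += p_xy * math.log2(p_xy / (p_x * p_y))
    return bits


# Four stimuli identified without confusion.
perfect = [[10, 0, 0, 0],
           [0, 10, 0, 0],
           [0, 0, 10, 0],
           [0, 0, 0, 10]]

# Four stimuli with responses unrelated to the stimulus.
chance = [[5, 5, 5, 5] for _ in range(4)]

print(transmitted_information(perfect))
print(transmitted_information(chance))
```

On this measure, five equally likely stimuli identified without confusion would transmit log2(5), or about 2.3 bits.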

Also related to such problems is
the question concerning the best way
to distribute stimuli along a continuum
in order to maximize information
transmission. Studies of this
sort have led to what is sometimes
called an “equal discriminability
scale” (10). Since the subject's task
in these experiments is to identify
and not to make comparisons, perhaps
a better name would be “equal
identifiability scale.” In any case the
psychophysical method employed in
these problems is single stimuli. The
experimenter presents a stimulus and
the subject tries to give the appropriate
response.

Problems of this sort are of course 
not limited to single stimulus dimen- 
sions. The problem of pattern recog- 
nition, for example, nearly always in- 
volves multivariate stimuli, usually 
visual or auditory although they may 
be tactual (12), and our interest is in 
how the observer is able to classify 
or recognize complex stimulus con- 
figurations. As already noted, it is to 
these problems that information the- 
ory has contributed useful tools. 


II. ORDINAL SCALES


For the setting of perceptions in a
rank order with respect to some as- 
pect or attribute we have three con- 
ventional methods: pair comparison, 
rank order, and rating scale. The use 
of these methods for the purpose of 
ordering is usually straightforward 
and devoid of special problems. In 
the sensory area the ordering of psy- 
chological magnitudes is seldom a 
serious problem, because subjective 
magnitude is usually a monotonic 
function of stimulus magnitude, and 
we take the ordering for granted. 
Occasionally, however, we find that 
the simple monotonic relation breaks 
down. For example, the apparent 
saturation of a light of a single wave- 
length grows with intensity up to a 
certain value, but when the intensity 
is increased further the color bleaches
out and saturation declines.

The problem of ordering is sometimes
extended to the relative spacing
among stimuli, and the order of
the spacing is sometimes deduced
from confusions among the stimuli.
The reasonable assumption is made
that, if stimulus B is confused with A
more often than C is confused with A,
then C is farther from A than B is
from A. The method of single stimuli
is one procedure that might be used 
in an experiment of this sort.

Although measurements of confusion
may order stimuli relative to a
single point, as in the foregoing ex- 
ample, there are situations in which 
relative distances cannot be ordered 
by measures of confusion. On pro- 
thetic continua, where jnd’s do not 
constitute subjectively equal dis- 
tances (31) it is not always true that 
distances can be ordered by this pro- 
cedure. Thus, if A and B are con- 
fused more often than C and D, it 
does not always follow that the dis- 
tance from A to B is less than from 
C to D. In particular, if A and B are 
two intense tones, and C and D are 
two faint tones (all of the same fre- 
quency), tones A and B may be 
farther apart in subjective magni- 
tude (sones) and still be confused 
more often. For example, tones of 
100 and 101 sones would be confused 
more often than tones of 1.0 and 1.5 
sones. 
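
The arithmetic behind this example can be made explicit. The sketch below assumes that loudness in sones grows as the 0.3 power of sound energy (the exponent quoted later in this paper, here taken to apply to energy rather than pressure) and converts each pair of loudnesses into a physical separation in decibels. The function name is hypothetical.

```python
import math

# Exponent quoted in the text; assumed here to apply to sound energy.
LOUDNESS_EXPONENT = 0.3


def level_difference_db(sones_a, sones_b, exponent=LOUDNESS_EXPONENT):
    """Decibel separation of two tones whose loudnesses are given in sones.

    Assumes loudness = k * energy ** exponent, so that the energy ratio
    equals the loudness ratio raised to the power 1 / exponent.
    """
    loudness_ratio = sones_b / sones_a
    return (10.0 / exponent) * math.log10(loudness_ratio)


intense_pair = level_difference_db(100.0, 101.0)
faint_pair = level_difference_db(1.0, 1.5)
print(f"100 vs 101 sones: {intense_pair:.2f} dB apart")
print(f"1.0 vs 1.5 sones: {faint_pair:.2f} dB apart")
```

Under these assumptions the intense pair, although a full sone apart, differs by only a small fraction of a decibel, whereas the faint pair, half a sone apart, differs by several decibels. Confusion follows the physical separation, not the subjective distance.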


III. INTERVAL SCALES 


Many interesting problems arise 
when we take on the task of erecting 
a scale of equal intervals on a psychological
continuum. The determination
of equal sense-distances seems
to have originated with Plateau, who 
asked eight artists to paint a gray 
whose shade appeared equidistant be- 
tween a black and a white (see 31). 
In this manner, Plateau invented the 
method of bisection, which became 
equisection when the intervals were 
subdivided further. But Plateau, 
like many after him (see, for example, 
13), did not perceive the basic differ- 
ence between bisection and such 
“ratio methods” as fractionation. As 
we shall see, this distinction is very 
important for two reasons: (a) bisection
can lead at best only to an interval
scale, and (b) it turns out that
human observers are so constituted
that they are generally unable to bisect
an interval on a quantitative or
intensitive (prothetic) continuum
without making a systematic error.

The fact that bisection leads only 
to an interval scale is obvious enough. 
That subjects cannot perform valid 
bisections on certain types of con- 
tinua is not so obvious, however. 
Evidence for this statement is de- 
scribed elsewhere (31), but since the 
argument is relevant to our present 
concern, let us review it briefly. 

Systematic studies of more than a 
dozen perceptual continua have 
shown that these continua divide 
themselves into two varieties. Class 
I comprises the quantitative, intensi- 
tive continua, the continua concerned 
with how much. Discrimination on 
certain of these continua, such as 
loudness, brightness, and heaviness, 
seems to involve an additive process 
at the physiological level. For this 
reason Class I has been called pro- 
thetic. Class II includes the qualita- 
tive and positional continua, the con- 
tinua concerned with what or where. 
Discrimination on these continua 
seems to involve a substitutive proc- 
ess, and they are therefore called 
metathetic. 

Now, on prothetic continua we find 
that the psychological magnitude, 
as determined by ratio scaling pro- 
cedures, approximates a power func- 
tion of the stimulus magnitude. The 
rule is that equal stimulus ratios cor- 
respond to equal sensation ratios. 
This rule, the writer has suggested, is 
a basic “psychophysical law.” The
exponents of the power functions 
range from about 0.3 for loudness and 
brightness to about 3.5 for the ap- 
parent intensity of electric current 
applied to the fingers (34, 36). Since 
differential sensitivity on these continua
tends to approximate Weber's
law, or rather the modified form of
this law, $\Delta S = k(S + c)$, it follows that
the psychological magnitude represented
by $\Delta S$ increases as the stimulus
increases. Discrimination, in other
words, is not constant over the con- 
tinuum, when measured in subjective 
units. There is a basic asymmetry in 
sensitivity. 
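
The step from Weber's law to a subjectively growing jnd can be spelled out. The following is a sketch, not the author's own derivation: it assumes a power function with exponent $n$ and scale constant $a$, writes $\psi$ for psychological magnitude (symbols introduced here for illustration), and treats the jnd as small enough for a first-order approximation.

```latex
\begin{align*}
\psi &= a S^{n}, \qquad \Delta S = k(S + c) \\
\Delta\psi &\approx \frac{d\psi}{dS}\,\Delta S = a n S^{n-1}\,k(S + c) \\
\text{for } S \gg c:\quad \Delta\psi &\approx n k\,a S^{n} = n k\,\psi
\end{align*}
```

On these assumptions the subjective size of the jnd is roughly proportional to the psychological magnitude itself, which is the asymmetry referred to above.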

On the other hand, on metathetic 
continua, discrimination, measured in 
subjective units, tends to be uniform 
over the scale, and there is no system- 
atic asymmetry as there is with pro- 
thetic continua. This difference be- 
tween the two continua makes for 
rather different behavior on the part 
of the observer. On the “symmetrical”
metathetic continua, the results
of bisection tend to agree with the 
results obtained by direct magnitude 
estimation and related procedures. 
But on the “asymmetrical” prothetic 
continua the point of bisection tends 
systematically to be lower than the 
point predicted by direct magnitude 
estimation. (Bisection is also plagued 
by a curious and dramatic order effect,
which has been called “hysteresis”
[31].)

The phenomena that characterize 
equisection also show up in the meth- 
od of interval estimation, which is a 
kind of inverse of equisection. In 
equisection the intervals are adjusted 
to meet some criterion (usually 
equality), whereas in interval estima- 
tion the experimenter sets a series of 
stimuli and the observer estimates 
their apparent spacing. A convenient 
procedure for reporting these estimates
is to have the observer adjust
the positions of a set of markers along 
a line. The apparent intervals be- 
tween successive markers are made to 
appear proportional to the apparent 
intervals between the stimuli. This 
method has been used with loudness 
and with lifted weights (31). The 
markers were movable sliders on a 
steel bar set before the observer. 
This procedure is in some ways analogous
to the use of a continuous rating
scale on which the judge places a
pencil mark on a line. Since apparent
position (on a line) is a metathetic
continuum on which discrimination
is not asymmetrical, adjustments of
visual position provide an unbiased
method of assessing the apparent
spacing of other stimuli, except for
possible distortions due to end effects
(37). Thus the essential features that
characterize equisection on prothetic
continua (hysteresis and the bias due
to the asymmetry of sensitivity) are
also revealed by the method of interval
estimation. (For an interesting
variation on the method of interval
estimation, see 17a.)

The discrepancy between the “interval”
judgment and the judgment
of magnitude is especially striking 
when we use the method of category 
rating, under which the observer as- 
signs a finite set of numbers or adjec- 
tives to a set of stimuli and tries to 
space the categories equally. Plotted 
against the ratio scale of subjective 
magnitude, the category scale has 
turned out to be concave downward 
on nine prothetic continua recently 
examined (36, 37). This nonlinearity 
in the category scales shows up even 
when the “pure” form of the category
scale is obtained by a process of ex- 
perimental iteration. 

On metathetic continua, such as 
pitch, position, inclination, and pro- 
portion, the category scale may be 
linearly related to the magnitude 
scale, provided the distortions due to 
stimulus spacing, landmarks, and dif- 
ferential familiarity have been neu- 
tralized. These factors are some of 
the second-order variables that can 
alter the form of the category scale. 

Another method for obtaining a 
category scale is the method of category
production (37). This is a kind of
inverse of category rating. Instead of
asking the observer to assign categories
to the stimuli, the experimenter
names the categories, in irregular
order, and the observer adjusts
the stimulus to produce his concep- 
tion of each category. Examples of 
the extreme categories (e.g., No. 1 
and 7) may be presented to the ob- 
server at the outset. In our few tests 
of this method we found that it 
seemed to give directly a close ap- 
proximation to the “pure” category 
scale that would be obtained by ex- 
perimental iteration. 

What is essentially the method of 
category production has also been 
used to study how people make linear 
interpolations in a spatial interval 
(22). This is a metathetic continuum.
The experimenter is usually con- 
cerned with the ‘objective’ accuracy 
of the observer's settings, although 
he may also be concerned with the 
form of the observer’s subjective 
scale. 

In summary, then, the four meth- 
ods, equisection, interval estimation, 
category rating, and category pro- 
duction, are all designed to produce 
an interval scale of “equal sense-distances.”
If properly used, they
can achieve this end on metathetic 
continua, but on prothetic continua 
they fail to produce intervals that are 
equal, as measured by the ratio scales 
of the continua. 

We turn now to the class of methods
that seek to produce equal intervals
via the “unitizing” of one or another
measure of variability. Pair
comparison is perhaps the best known 
example of this procedure, but the 
underlying philosophy is similar for 
the other three methods listed in 
Table 2 (see 14). By making certain 
simple assumptions regarding the dis- 
tribution of the observed variabilities 
or confusions, we try to deduce the 
form of the underlying continuum. 
It would appear that on metathetic 
continua, where sensitivity to dif- 
ferences is uniform (in subjective 
units), the distribution assumptions
most commonly invoked may lead to
an interval scale. But on prothetic
continua, the assumptions of normal
and uniform variability are demonstrably
in error, and therefore the resulting
scales are not scales of equal
intervals. On prothetic continua the
procedures ordinarily used to derive 
equal intervals from measures of vari- 
ability or confusion miss the mark 
for the same reason that “Fechner’s
law” fails: the subjective size of the
jnd is not constant over the continuum.
Likewise “discriminal dispersion”
is not uniform over a prothetic
continuum. As a matter of fact, the
psychophysical power law, coupled 
with the relativity of resolving power, 
leads us to predict that discriminal 
dispersion is not distributed normally 
(or even symmetrically) on a linear 
subjective measure of a prothetic con- 
tinuum, although it may be normal 
on a logarithmic measure of the con- 
tinuum. 

We must conclude, I think, that 
those procedures that make use of an 
assumed canonical distribution of 
variability are less useful for scaling 
than methods that utilize directly a 
measure of location. Even so, in the 
determination of equal intervals on 
prothetic continua, these latter meth- 
ods are themselves subject to invali- 
dating biases. It appears, therefore, 
that the only proper method for de- 
termining equal intervals on a pro- 
thetic continuum is to construct a 
ratio scale (see Section V). This solu- 
tion is possible because the ratio scale 
contains the interval scale. 

IV. LOGARITHMIC INTERVAL SCALES

The possibility that discriminal 
dispersion may increase proportional 
to the psychological magnitude on a 
prothetic continuum suggests that an 
assumption to this effect might make 
it possible to scale the continuum 
into intervals that are equal in terms
of logarithms. So far as the writer is
aware, no use has been made of such 
scales, although he has elsewhere (31, 
35) described some of their properties, 
including their mathematical group 
structure. In proceeding in this fash- 
ion we would be assuming that the 
conventional procedures used to scale 
a continuum by the method of pair 
comparison give us, for prothetic con- 
tinua, a scale on which the values are 
separated not by equal intervals, 
but by equal ratios, i.e., $a/b = c/d = \ldots$

The equating of ratios, either by 
way of a processing of variability or 
by a direct judgment of apparent 
ratios, would provide the basis for 
what the writer has called a logarithmic
interval scale. This scale is
invariant under a power transformation,
i.e., for any value $x$ we can substitute
$x'$ where $x' = ax^b$, and where $a$
and $b$ are positive numbers. As with
the linear scale of equal intervals, the 
zero point on the logarithmic scale 
can be chosen arbitrarily and moved 
at will. 
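
The invariance just stated is easy to check numerically. The sketch below uses hypothetical helper names and arbitrary constants; it shows that a set of values standing in equal ratios keeps that property under the power transformation, whereas adding a constant, which would be admissible on a linear interval scale, destroys it.

```python
import math


def power_transform(x, a, b):
    """Apply x' = a * x**b, with a and b positive."""
    return a * x ** b


def has_equal_ratios(values, rel_tol=1e-9):
    """True when successive values stand in a constant ratio."""
    ratios = [hi / lo for lo, hi in zip(values, values[1:])]
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


scale = [1.0, 2.0, 4.0, 8.0]  # equal ratios, unequal intervals
rescaled = [power_transform(x, a=3.0, b=0.5) for x in scale]
shifted = [x + 1.0 for x in scale]  # adding a constant, for contrast

print(has_equal_ratios(scale), has_equal_ratios(rescaled), has_equal_ratios(shifted))
```

Taking logarithms turns the power transformation into a linear one, log x' = log a + b log x, which is why the scale behaves in logarithms as the ordinary interval scale behaves in the raw values.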

If such a scale were desired, the
straightforward procedure for achieving
it would presumably be some procedure
of direct ratio matching.
Methods of this type have not been
very thoroughly explored, although
Garner (11) tried equating loudness
ratios. He seemed to find evidence
that observers may not be able to
keep separate the two tasks, that of
equalizing ratios and that of equalizing
intervals. On the other hand,
J. C. Stevens obtained interesting results
when two brightnesses were
used to define a subjective ratio, and
the observer adjusted the ratio between
two loudnesses to make the
ratio between the loudnesses match
the apparent ratio between the
brightnesses (31).

Ratio matching of this kind may
have utility for psychophysics, because
it provides an alternative
method for demonstrating that the
“psychophysical law” governing prothetic
continua is a power function,
i.e., the psychological magnitude is 
equal to the stimulus magnitude 
raised to a power. In principle, the 
power function can be tested by this 
method without requiring the ob- 
server to make numerical estimates 
of ratios or of magnitudes (as de- 
scribed in the next section). The ob- 
server is required only to make the 
apparent ratio between one pair of 
stimuli equal the apparent ratio be- 
tween another pair of stimuli. Then 
if it turns out that the ratio of one 
pair always equals the ratio of the 
other pair raised to a power, it fol- 
lows that the psychological magni- 
tudes are power functions of their 
respective stimuli. 

For example, suppose the subject 
adjusts pairs of luminances ($B_1$ and
$B_2$) to make the ratio of the brightnesses
equal the ratio of the apparent
lengths of two lines ($L_1$ and $L_2$). If
for all possible pairs so matched we
find

$$L_1/L_2 = (B_1/B_2)^m$$


where m is a constant, then both sub- 
jective length and subjective bright- 
ness are power functions of their re- 
spective stimuli. The limitation on 
this approach is that we could not by 
this procedure determine the value of 
the two exponents involved. That 
would require the additional methods 
described below. We could, however, 
by such ratio matchings determine 
the relative values of the exponents 
for length and brightness. 
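
Why ratio matching yields only the relative exponents can be seen from a short calculation. This is a sketch under the assumption that both continua follow power functions; the symbols $\psi$, $\alpha$, $\beta$, $a$, and $c$ are introduced here for illustration and do not appear in the original.

```latex
\begin{align*}
\psi_L &= a L^{\alpha}, \qquad \psi_B = c B^{\beta} \\
\frac{\psi_L(L_1)}{\psi_L(L_2)} = \frac{\psi_B(B_1)}{\psi_B(B_2)}
  \;&\Longrightarrow\;
  \left(\frac{L_1}{L_2}\right)^{\alpha} = \left(\frac{B_1}{B_2}\right)^{\beta} \\
\frac{L_1}{L_2} &= \left(\frac{B_1}{B_2}\right)^{m}, \qquad m = \frac{\beta}{\alpha}
\end{align*}
```

The matching data fix the constant $m$, and hence the ratio of the two exponents, but any pair of exponents with the same ratio would fit equally well.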

Whether observers can make such
ratio matches with sufficient consistency
to make it a profitable procedure
has not been explored very
thoroughly. Nevertheless, the experiments
involving the matching of
ratios of loudness and brightness
have thus far been encouraging. In
this particular case, the observers ad- 
justed the two sounds to approxi- 
mately the same physical ratio as the 
experimenter set between the two 
luminances. This outcome is con- 
sistent with other evidence that loud- 
ness and brightness are both power 
functions of their stimuli, not loga- 
rithmic functions as Fechner sup- 
posed, and that the exponents for 
both loudness and brightness are ap- 
proximately the same. By means of 
the ratio scaling procedures discussed
below we have shown that
both exponents are of the order of 0.3
(31).


V. RATIO SCALES


Perhaps the lack of interest in logarithmic
interval scales stems from the
fact that the scientist's greater interest
lies in ratio scales. He is less
interested in ratios whose values are
equal but indeterminate than he is in
ratios whose values he can specify.
If he has a procedure for equating 
ratios, plus a procedure for equating 
intervals, he can proceed to construct 
a ratio scale on which the zero point 
is not arbitrary (26). 

We have seen, however, that observers
exhibit a systematic bias
when they try to equate intervals on 
a prothetic continuum. Whether, 
and under what conditions, observ- 
ers can equate a series of unknown 
ratios has not been fully explored. 
In view of this state of affairs, how do 
we create ratio scales on a perceptual 
continuum? 

The answer seems to be that we 
ask the subject to judge the value of 
the ratio, or of the magnitude, di- 
rectly. Using one or another, or pref- 
erably combinations, of four different 
methods we proceed directly to the 
goal of assessing relative psychologi- 
cal magnitudes. The potential meth- 
ods are as follows: 


Ratio estimation calls for the presentation
of two or more stimuli, and
the observer names the value of the 
apparent ratio between them. The 
so-called constant sum method (15) is 
a special instance of this procedure. 
As typically used, this method re- 
quires that the observer divide 100 
points between two stimuli in such a 
way that the division between the 
points reflects the apparent ratio be- 
tween the sensations. In general, 
however, there is little reason to re- 
strict the observer's method of report 
in this manner. Restrictions and 
constraints on the observer are often 
a source of trouble and bias in ratio 
and magnitude judgments. 

It should be mentioned that, in us- 
ing any of the methods for determin- 
ing ratio scales, the experimenter will 
generally do well to compute medi- 
ans, for there is no limit to how far 
an occasional observer may deviate 
from the rest of the group. 
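
A small numerical illustration may make both points concrete: how a constant-sum response translates into a ratio, and why the median is the safer average. The point splits below are hypothetical, as is the function name.

```python
from statistics import mean, median


def constant_sum_to_ratio(points_a, total=100):
    """Convert a constant-sum split into the implied ratio A : B."""
    points_b = total - points_a
    if points_b <= 0:
        raise ValueError("the second stimulus must receive some points")
    return points_a / points_b


# Hypothetical points given to stimulus A by seven observers (out of 100).
splits = [65, 67, 70, 66, 68, 64, 99]
ratios = [constant_sum_to_ratio(p) for p in splits]

print(f"median ratio: {median(ratios):.2f}")
print(f"mean ratio:   {mean(ratios):.2f}")
```

Six of the seven observers imply a ratio near 2, and the median reflects that. The single observer who splits the points 99 to 1 implies a ratio of 99 and pulls the arithmetic mean far from the rest of the group.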

Ratio production is probably best 
known by the name of one of its sub- 
varieties, fractionation. The observer 
adjusts a stimulus to produce a pre- 
scribed ratio between two apparent 
magnitudes. He sets a variable to be 
1/2, 1/3, 1/4, or some other fraction of a
standard. Alternatively the experi- 
menter may set the stimuli and the 
observer may report whether they 
meet the criterion of a prescribed 
ratio. This procedure is analogous to 
the method of constant stimuli and is 
sometimes called by this name. The 
principal drawback of fractionation 
with fixed stimuli is that the choice of 
the levels at which the comparison 
stimuli are set may be critical. This 
difficulty can presumably be sur- 
mounted if the spacing of the com- 
parison stimuli is determined by a 
process of experimental iteration 
under which the levels are altered in 
accordance with the outcome of successive
experiments in such a way
that the “criterion” stimulus is made
to lie at or near the center of the 
range of the comparison stimuli (37). 

The method of fractionation has its
inverse in the method of multiplication,
under which the observer sets a
variable to some prescribed multiple
of a standard. In order to balance out
certain sources of bias it is nearly always
wise to complement fractionation
with multiplication (32). This
is especially true when wide ranges of
stimulus values are explored in experiments
on loudness and brightness.

Magnitude estimation refers to a 
procedure by which the observer 
makes a direct numerical estimation 
of the psychological magnitudes of a 
series of perceptions (30). Two main 
varieties of this procedure have been 
used. Under one of them the experi- 
menter presents a stimulus and as- 
signs it a number (modulus) such as 
10, say. He then presents other stim- 
uli, and the observer assigns to them 
numbers proportional to their appar- 
ent magnitude. Under the other var- 
iation in the method, no modulus is 
prescribed. The stimuli appear in ir- 
regular order and the observer as- 
signs numbers proportional to mag- 
nitude, using a modulus of his own 
choosing. 

The method of direct magnitude 
estimation has given good results 
with loudness, brightness, lifted 
weights, duration, lightness of grays, 
visual length, pitch, proportion (37), 
finger span (unpublished), vibration 
(unpublished), and electric shock (34, 
36). Incidentally, with this method 
it often turns out that the geometric 
mean falls close to the median. 
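
One way such estimates might be reduced is sketched below: the judgments for each stimulus are pooled by the geometric mean, and the exponent of the power function is taken as the slope of the line relating log magnitude to log stimulus. The data are invented for illustration, and the least-squares fit is an assumption of this sketch, not a procedure prescribed in the text.

```python
import math
from statistics import median


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fit_power_exponent(stimuli, magnitudes):
    """Least-squares slope of log(magnitude) on log(stimulus)."""
    xs = [math.log10(s) for s in stimuli]
    ys = [math.log10(m) for m in magnitudes]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical estimates: keys are stimulus intensities (arbitrary units),
# values are the numbers assigned by five observers.
estimates = {
    1.0: [9, 10, 10, 11, 12],
    10.0: [17, 19, 20, 21, 25],
    100.0: [33, 38, 40, 42, 50],
    1000.0: [65, 76, 80, 84, 100],
}

stimuli = sorted(estimates)
pooled = [geometric_mean(estimates[s]) for s in stimuli]
exponent = fit_power_exponent(stimuli, pooled)

first = estimates[stimuli[0]]
print(f"geometric mean {geometric_mean(first):.1f} vs median {median(first)}")
print(f"fitted exponent: {exponent:.2f}")
```

With these invented numbers the fitted exponent comes out near 0.3, and the geometric mean for each stimulus lies close to the median, as the text remarks.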

Magnitude production is a method 
that has been named but not yet 
thoroughly explored (31). It is the in- 
verse of magnitude estimation in that 
the experimenter states a series of 
magnitudes (presumably in irregular
order) and the observer adjusts a
stimulus to produce them. In some 
ways the procedure resembles the 
method of category production de- 
scribed above, except that no range is 
specified and the observer tries to 
judge in terms of apparent magni- 
tudes and not in terms of a finite 
number of prescribed categories. 
Magnitude production is a poten- 
tially interesting method, provided 
the stimulus control is such that the 
subject can adjust the stimulus over 
the required range, but many ques- 
tions concerning its peculiarities and 
difficulties remain to be answered. It 
is not unlikely that the biases in mag- 
nitude production are such that, in a 
balanced program, they might be 
used to offset some of the systematic 
errors in magnitude estimation. It 
seems, in general, that each of the 
ratio-scaling methods may contain 
biases peculiar to itself, and that the 
elimination of the biases can some- 
times be achieved by means of a 
counterbalanced design in which the 
biases inherent in one method are 
evaluated and corrected by means of 
a method that contains biases of an 
opposite sort. The principle is anal- 
ogous to that employed in the use 
of a balance in weighing an object: 
in order to discover and correct for 
possible asymmetries in the balance 
we interchange the weights on the 
scale pans. 

We have already noted that, on 
prothetic continua, ratio scales of 
psychological magnitude turn out to 
be power functions of the stimulus 
magnitude. The power law seems to 
hold on at least 16 perceptual con- 
tinua. On metathetic continua, the 
ratio scaling methods may or may 
not give a power function. When a 
power function is found to hold, the 
exponent appears to be 1.0. But on a
metathetic continuum like pitch, the
psychological magnitude, measured
in mels, is definitely not a power function
of frequency in cycles per second
(37). 

Another important point to note is 
that, when prothetic continua are 
scaled by the ratio methods, Fech- 
ner’s law obviously fails. The relation 
between perceptual magnitude and 
stimulus magnitude is not logarith- 
mic. Even if we take the weaker form 
of Fechner’s law, which says that the 
counting off of jnd’s gives the scale of 
perceptual magnitude, we find that 
the law fails almost as badly. Suc- 
cessive jnd’s, in other words, are not 
subjectively equal. 

On metathetic continua, on the 
other hand, the jnd scale apparently 
coincides with the scale of psychological
magnitude. Thus, to a fair
approximation, jnd’s in pitch represent
constant increments in mels
(27).


THE ESTIMATION OF OBJECTIVE 
VALUES 


The methods of psychophysics are 
ordinarily designed to solve problems 
related to the nature of organisms. 
The focus of interest is typically the 
normal observer, his thresholds, his 
resolving powers, and the magnitudes 
of his perceptions. The methods 
have, of course, their clinical uses, 
and the assessment of individual dif- 
ferences is central to many practical 
undertakings. 

Quite different in aim, but some- 
times similar in procedure, is another 
human activity involving discrimi- 
nation and judgment. This is the use 
of the human being as an instrument 
to measure the objective values of 
things. Despite the ingenuity of 
modern instrumentation, many tasks 
of rating, grading and judging can 
still best be done by two-legged 
meters (cf. 23). Man’s sensing, differentiating,
and integrating circuits
still surpass in flexibility and power
any inanimate substitutes yet de- 
vised. Instruments may aid but they 
do not displace the wine taster, the 
leather grader, the lumber sorter, or 
any of a host of other judges on 
whom commerce depends for the 
appraisal of its wares. Little of this 
type of activity gets attention in the 
academic laboratory, although much 
could probably be learned from its 
systematic study. 

In the framework of our present 
concerns, the assessing and grading 
of objective things are practical prob- 
lems—a potential field perhaps for an 
applied psychophysics. The chief 
difference between problems of this 
type and those we have been consid- 
ering lies in the point of view. In 
addressing any of the five problems 
in Table 2 we seek to learn the properties
of the human instrument; in
problems of grading we care nothing
about the properties of the instrument
as such, but only about the
accuracy of its indications. In the
grading of wool, for example, the
mill owner hopes the assessment of
the “clip” will tell him more about
the wool than it does about the
grader. He hopes, in other words,
that the grader will commit the
“stimulus error” 100%. The psychophysicist
presenting a series of tones
to be judged for loudness hopes quite
the opposite. He wants the subject
to report apparent loudness and not
to judge how many decibels are probably
being produced by the earphone.
Some experienced judges can do
either task at will, but the estimation
of decibel levels requires a lot of
training. How the properties of the
judge, as appraised and systematized
by psychophysics, interact with the
applied problems of grading and rating
is a potentially interesting problem.

The capacity of the human instrument
to make correct assessments of
this or that is also a central issue in 
some branches of engineering psy- 
chology, and it is in these connections 
that the most studies 
have been made. This is not the place 
to review this lively field, but an 
illustrative example may be in order. 
It concerns the continuous control 
of a complex system, like the problem 
faced by the pilot taking a large ship 
through the narrow channels of the 
Suez Canal. The same kind of prob- 
lem comes home to many of us when 
we try to back an automobile with a 
luggage trailer tied on behind. We 
watch the trailer to see where it is 
going. It goes to the left, let us say, 
when it should go to the right, so we 
turn the front wheels of the car and 
watch to see what happens. 


systematic 


The 
trailer corrects its course to the de- 
direction, whereupon we 
straighten out the front wheels and 
continue to back up. To our dismay 
the trailer keeps turning to the right. 
We find we have overcorrected, be- 
cause we attended only to the “‘out- 
put”’ of the system, and did not take 
into account the delay between the 
input control (turning the 


sired 


front 


REFE 
Conditions affecting the 
information in 


Psve hol. Rev., 


1. Attutsi, E. A. 
amount. of 
judgments. 
97-103. 

von Béxftsy, G. Uber ein neues Audi- 
ometer. Arch. elekt. Ubertragung, 1947, 
A; oa: 

3. von BExKEsy, G. A 
Acta. Oto-Laryng 
411-422. 

4. BrrmMincHaM, H. P., & TAyLor, F. V. A 
design philosophy for man-machine 

control systems. Proc. I.R.E., 1954, 42, 
1748-1758. 

BLACKWELL, H. R. Psychophysical 
thresholds. Engng Res. Bull. No. 36, 
Engineering Research Institute, Univer. 
of Michigan, 1953. 

6. BLouGcn, D. S. Method for tracing dark 


absolute 
1957, 64, 


new audiometer. 


Stockh., 1947, 35, 


mn 


PROBLEMS AND METHODS OF PSYCHOPHYSICS 


195 


wheels) and the final movement of 
the trailer. With enough practice, a 
driver can learn to manage the pro- 
cedure and even to back the trailer 
into a garage. He must learn not 
only to judge the position and direc- 
tion of the trailer, but also to act as 
an integrator and predict the effects 
of his control actions as they will be 
summed up over a period of time. 
Problems of this sort have opened 
new fields of study concerned with 
man-machine control systems (4). 
One of the primary problems is to 
learn what objective aspects of the 
situation the operator can judge with 
greatest reliability, and then to dis- 
play to him only those features that 
match his judgmental capacity. It 
has been found, for example, that best 
results are achieved when the human 
the task of 
performing integrations and _ differ- 
entiations. The controls and displays 
must be engineered in such a way 
that the operator can effectively con- 
trol the system by acting as a simple 
amplifier. The need for a complex 
and difficult judgment of objective 
values is thereby replaced by a 
simpler demand on judgment. 


operator is relieved of 


RENCES 


adaptation in the pigeon. Science, 1955, 
121, 703-704. 

7. BRUNER, J. S., GOODNOW, J. J., & AUSTIN,
G. A. A study of thinking. New York:
Wiley, 1956.

8. DIXON, W. J., & MASSEY, F. J., JR. In-
troduction to statistical analysis. New
York: McGraw-Hill, 1951.

9. EVANS, R. M. An introduction to color.
New York: Wiley, 1948.

10. GARNER, W. R. An equal discriminability
scale for loudness judgments. J. exp.
Psychol., 1952, 43, 232-238.

11. GARNER, W. R. A technique and a scale
for loudness measurement. J. acoust.
Soc. Amer., 1954, 26, 73-88.

12. GELDARD, F. A. Adventures in tactile
literacy. Amer. Psychologist, 1957, 12,
115-124.

13. GRAHAM, C. H. Behavior, perception and
the psychophysical methods. Psychol.
Rev., 1950, 57, 108-120.

14. GUILFORD, J. P. Psychometric methods
(2nd ed.) New York: McGraw-Hill,
1954.

15. METFESSEL, M. F. A proposal for quanti-
tative reporting of comparative judg-
ments. J. Psychol., 1947, 24, 229-235.

16. MILLER, G. A. The magical number
seven, plus or minus two: Some limits
on our capacity for processing informa-
tion. Psychol. Rev., 1956, 63, 81-97.

17. MUELLER, C. G. Frequency of seeing
functions for intensity discrimination at
various levels of adapting intensity.
J. gen. Physiol., 1951, 34, 463-474.

17a. NEWHALL, S. M. A method of evaluat-
ing the spacing of visual scales. Amer.
J. Psychol., 1950, 63, 221-228.

18. OLDFIELD, R. C. Continuous recording of
sensory thresholds and other psycho-
physical variables. Nature, 1949, 164,
581.

19. OLDFIELD, R. C. Apparent fluctuations of
a sensory threshold. Quart. J. exp.
Psychol., 1955, 7, 101-115.

20. OPTICAL SOCIETY OF AMERICA, COMMIT-
TEE ON COLORIMETRY. The science of
color. New York: Crowell, 1953.

21. ROSENBLITH, W. A., & STEVENS, K. N.
On the DL for frequency. J. acoust.
Soc. Amer., 1953, 25, 980-985.

22. SCHUBERT, R. G., & JENKINS, W. L. The
effect of brief training on linear interpo-
lation. J. appl. Psychol., 1956, 40, 53-
54.

23. SCOTT BLAIR, G. W. Measurements of
mind and matter. New York: Philo-
sophical Library, 1956.

24. STEVENS, S. S. The relation of pitch to
intensity. J. acoust. Soc. Amer., 1935, 6,
150-154.

25. STEVENS, S. S. Sensation and psycho-
logical measurement. In E. G. Boring,
H. S. Langfeld, and H. P. Weld (Eds.),
Foundations of psychology. New York:
Wiley, 1948.

26. STEVENS, S. S. Mathematics, measure-
ment and psychophysics. In S. S.
Stevens (Ed.), Handbook of experimen-
tal psychology. New York: Wiley, 1951.

27. STEVENS, S. S. Pitch discrimination,
mels, and Kock’s contention. J. acoust.
Soc. Amer., 1954, 26, 1075-1077.

28. STEVENS, S. S. On the averaging of data.
Science, 1955, 121, 113-116.

29. STEVENS, S. S. The calculation of the
loudness of complex noise. J. acoust.
Soc. Amer., 1956, 28, 807-832.

30. STEVENS, S. S. The direct estimation of
sensory magnitudes: loudness. Amer.
J. Psychol., 1956, 69, 1-25.

31. STEVENS, S. S. On the psychophysical
law. Psychol. Rev., 1957, 64, 153-181.

32. STEVENS, S. S. Concerning the form of
the loudness function. J. acoust. Soc.
Amer., 1957, 29, 603-606.

33. STEVENS, S. S. Calculating loudness.
Noise Control, 1957, 3, No. 5, 11-22.

34. STEVENS, S. S. Measurement and man.
Science, 1958, 127, 383-389.

35. STEVENS, S. S. Measurement, psycho-
physics, and utility. In Symposium on
measurement: held by the AAAS, De-
cember 1956. New York: Wiley, in
press.

36. STEVENS, S. S., CARTON, A. S., & SHICK-
MAN, G. M. A scale of apparent in-
tensity of electric shock. J. exp. Psy-
chol., in press.

37. STEVENS, S. S., & GALANTER, E. H. Ratio
scales and category scales for a dozen
perceptual continua. J. exp. Psychol.,
1957, 54, 377-411.

38. STEVENS, S. S., MORGAN, C. T., & VOLK-
MANN, J. Theory of the neural quan-
tum in the discrimination of loudness
and pitch. Amer. J. Psychol., 1941, 54,
315-355.

39. STEVENS, S. S., & VOLKMANN, J. The
quantum of sensory discrimination.
Science, 1940, 92, 583-585.

40. TANNER, W. P., & SWETS, J. A. A de-
cision-making theory of visual detec-
tion. Psychol. Rev., 1954, 61, 401-409.

41. ZWICKER, E., & FELDTKELLER, R. Über
die Lautstärke von gleichförmigen
Geräuschen. Acustica, 1955, 5, 303-
316.

42. ZWICKER, E., & KAISER, W. Der Verlauf
der Modulationsschwellen in der Hör-
fläche. Acustica 2, Akust. Beih., 1952,
4, AB 239-246.


Received January 2, 1958. 








Psychological Bulletin
Vol. 55, No. 4, 1958


PSYCHOLOGICAL INVESTIGATIONS OF COGNITIVE DEFICIT
IN ELDERLY PSYCHIATRIC PATIENTS¹


JAMES INGLIS 
Institute of Psychiatry, Maudsley Hospital, London 


The view has long been accepted 
(88) that disturbances of two ‘‘com- 
ponents” of cognitive functions are of 
especial importance in the psychiatric 
disorders of later life. Attention has 
been directed to the abnormal falling 
away of general intelligence from a 
previously higher level of efficiency, 
and to a failure of memory, both of 
which seem to take place to a patho- 
logical degree in some elderly persons. 

In the present paper the evidence 
from objective studies of these func- 
tions in elderly psychiatric patients 
will be reviewed. It is hoped that this 
review will provide a useful supple- 
ment to previous reviews of similar 
topics (cf. Eysenck [29], Granick 
[34], Grewel [36], Dörken [24], and
Jones and Kaplan [47]) by covering 
the more recent literature, by trying 
to ascertain if any consistent rela- 
tionships can be discussed within the 
experimental findings, and by at- 
tempting to select some framework 
of psychological theory which might 
be able to comprehend such relations 
as do emerge. 


DECLINE OF GENERAL INTELLEC- 
TUAL FUNCTION 

The studies to be examined here 
may be discussed under two heads: 
(a) studies mainly concerned with 
comparisons of tests, (b) studies
mainly concerned with comparisons 
of groups. 


1 The writer wishes to acknowledge sugges-
tions and criticisms made by M. B. Shapiro 
and F. Post. H. G. Jones and V. Meyer also 
read the manuscript and offered very helpful 
comments. 


Test comparisons. Babcock (3) was 
the first systematically to attempt to 
measure intellectual deficit by compar-
ing a score on a vocabulary test, sup-
posing this to be relatively impervi-
ous to decline and therefore to repre-
sent a “high-water mark” of intelli-
gence, with scores on other kinds of 
tests more affected by the process of 
deterioration. 

Brody (11) adapted Babcock’s test
for use with elderly psychiatric pa- 
tients. In one study he tested 83 pa- 
tients (age range, 50-69) who fell into 
four “categories” of severity of de- 
mentia, A to D, D being the most 
severe; these being defined by clinical 
ratings made by doctors and nurses. 
He was able to show significant dif- 
ferences between the “discrepancy 
scores” of these groups and went on 
to suggest that this test might be used 
diagnostically. Those patients mak- 
ing a score lower than 25 would be 
classified, he suggested, as “prob- 
ably not or doubtfully demented,” 
those scoring over 40 would be “prob-
ably seriously demented,” those be-
tween 25 and 40 being “mildly or
moderately demented.” However,
inspection of this author's tables (11, 
p. 321) reveals the range of scores for 
his criterion groups (assuming nor- 
mality of distribution) shown in 
Table 1. 

The same author (Brody [12]) tried 
to see if there were a ‘‘psychometric 
pattern” characteristic of dementia, 
using Babcock's tests and others. He 
noted that: 

The only positive conclusion suggested by
study of the psychometric pattern in the pres-
ent patients is that while, in dementia, vo-
cabulary ability is comparatively well pre-
served, other abilities severely decline, level-
ling down in a fashion which obscures the
common pattern in simple psychosis and nor-
mal senility. Otherwise there is no trace of a
specific pattern of abilities in dementia (12,
p. 519).


TABLE 1

THE RANGES OF BRODY’S “DISCREPANCY SCORES”
CALCULATED FROM MEANS AND SDs

Group    M ± 2SD             Range

A        16.10 ± 2(11.67)    -7.24 to 39.44
B+C      27.19 ± 2(11.26)     4.67 to 49.71
D        46.25 ± 2(13.24)    19.77 to 72.73
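The ranges in Table 1 rest on simple arithmetic, and their bearing on the suggested cutoffs of 25 and 40 can be checked directly. The following is a minimal Python sketch of that check, not a reproduction of any computation in the studies cited. The function names are illustrative, the third row of the table is assumed to be group D, and the M ± 2SD range is the normal-distribution approximation used in the text rather than anything derived from raw scores.

```python
# Means and SDs of Brody's "discrepancy scores" as listed in Table 1.
# Assumption: the unlabeled third row of the table is group D.
GROUPS = {
    "A": (16.10, 11.67),
    "B+C": (27.19, 11.26),
    "D": (46.25, 13.24),
}


def two_sd_range(mean, sd):
    """Return (low, high) for mean +/- 2 SD, rounded to two decimals."""
    return (round(mean - 2 * sd, 2), round(mean + 2 * sd, 2))


def classify(score):
    """Apply the suggested cutoffs: under 25, 25 to 40, over 40."""
    if score < 25:
        return "probably not or doubtfully demented"
    if score > 40:
        return "probably seriously demented"
    return "mildly or moderately demented"


def labels_reachable(low, high):
    """List every label that some score in [low, high] could receive."""
    # Probe both ends of the range, plus each cutoff that falls inside it.
    probes = [low, high] + [c for c in (25, 40) if low <= c <= high]
    return sorted({classify(p) for p in probes})


RANGES = {name: two_sd_range(m, sd) for name, (m, sd) in GROUPS.items()}

if __name__ == "__main__":
    for name, (low, high) in RANGES.items():
        print(name, low, high, labels_reachable(low, high))
```

Under these assumptions, scores within the range for the most severely demented group could receive any of the three diagnostic labels, which is the overlap that Table 1 is meant to expose.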


A somewhat similar approach was 
adopted by Halstead (38) who gave 
a battery of 25 items to 20 senile pa- 
tients (age range, 68 to 83). He was 
able to divide these tests into 3 
groups in terms of the difficulty these 
patients showed in coping with them. 
Least difficult were those involving 
old mental habits, visual recognition 
and simple motor tasks. Vocabulary,
simple arithmetic, rote memory and 
fluency came next. The most difficult 
tests were those which required the 
subjects to break away from old 
mental habits. When, however, Hal- 
stead (39) gave the same tests to 18 
further patients (age range, 70 to 83) 
judged clinically to be more de- 
mented, they were shown to be worse 
than the first group on nearly every 
test, including the vocabulary scale. 

Such a finding vitiates the use of 
vocabulary as a measure of “previous 
level,’’ at least in individual cases. 
Theoretical weaknesses in the use of 
this kind of measure have previously 
been pointed out by Yacorzynski 
(80), and others have been examined 
more recently by Yates (81). Em-
pirical evidence concerning the use-
fulness of vocabulary estimates in
this connection has been provided
by other workers. Trueblood (72),
for example, found deterioration of 
language in 25 senile patients (ages 
not given). Shakow, Dolkart and 
Goldman (65) found vocabulary per- 
formance in 56 arteriosclerotics and 
48 senile psychotics (age range, about 
55 to 85) more disturbed than could 
be accounted for on the basis of age 
alone. Ackelsberg (2) tested 50 senile 
patients (age range, 60 to 85) grouped 
as “least,” “mildly,” and “most” de-
teriorated. The factors of age, educa- 
tion, etc., were controlled as well as 
possible between the groups. Five 
different kinds of vocabulary test 
were used and the results showed that 
vocabulary functioning did not re-
main stable but, on the contrary, 
discriminated significantly between 
these groups of patients. Roth and 
Hopkins (64) reported that 14 senile 
psychotics (age range, 67 to 85)
whom they tested were largely infe- 
rior on a vocabulary test to a group of 
84 affective psychotics (age range, 60 
to 86). Orme (55) has shown a group 
of 25 senile dements (mean age, 
73.84) to be significantly worse on a 
vocabulary test than a group of 25 
elderly depressive patients (mean 
age, 68.17). There was, however, also 
a significant difference between the 
mean ages of the groups in the latter 
study. Pichot (57) suggested, how- 
ever, that verbal deterioration may 
take place to a more marked degree 
in cerebral arteriosclerosis than in 
senile dementia. He showed that 
when 27 arteriosclerotics (mean age, 
65.22) were matched with 25 senile 
dements (mean age, 73.08) in terms 
of general intellectual status on 
Raven's Progressive Matrices (61) 
the arteriosclerotic patients had a 
very significantly poorer Vocabulary 
score. 

It may be concluded, therefore, 
that while it is possible that verbal
ability may decline more slowly than
certain other abilities there can be 
little doubt that it, too, can be af- 
fected by the processes of deteriora- 
tion. 

M. D. Eysenck (28) analysed the 
performance of 100 clinically diag- 
nosed senile dements (mean age, 73, 
SD, 6.5) on the Matrices test. She 
wished to discover if these patients 
differed from the normal standardiza- 
tion population not only in terms of 
a lower total score but also in terms 
of qualitative characteristics such as 
order of difficulty of items within the 
test and most frequent errors made. 
She was able to show that while these 
patients made low scores, approxi- 
mately equal to the lowest 3 or 4% 
of the normal adult population on 
whom the test was standardized, they 
were not, in fact, qualitatively dif- 
ferent from children or normal adults 
in the respects measured. Bromley 
(13) similarly examined the results on 
the Matrices test of 35 elderly psychi- 
atric patients (mean age, 61, SD, 
11.3) including 8 diagnosed as senile, 
10 paranoid states, 12 depressives, 
and 5 organics. His results agreed, in 
the main, with those obtained by 
Eysenck, in showing a diminished 
total quantitative score, but little 
qualitative difference from normals. 

Eysenck (27) also factor an- 
alysed the results of a battery of 20 
tests, including the Matrices, given 
to 84 senile patients (mean age 73.4, 
SD, 6.5). She extracted one general 
and 3 group factors (speed, memory, 
and strength) accounting for 43% of
the variance in all. Her main conclu- 
sion from this analysis was that the 
general factor extracted gave a dif- 
ferent picture of the mental organiza- 
tion of such patients as compared 
with normal adults. The test with 
the highest saturation on the general 
factor was Vocabulary (.71) whereas
the Matrices showed a low saturation
(.34). Eysenck attempted to explain 
these results in terms of Cattell’s (21) 
notions regarding “fluid” and “crys-
tallized” intelligence. She concluded
that, on the basis of these concep- 
tions: 

. . . we should expect tests of fluid ability,
such as the matrix, to have very high factor 
saturations in a study of adolescents and 
young people generally but to give low factor 
saturations in extreme old age; while, con- 
versely, crystallized abilities would give com- 
paratively high factor saturations in old age. 
This is precisely what we do find in this analy- 
sis, and we may, therefore consider that this 
analysis gives a certain measure of support to 
the theory set forth by Cattell (27, p. 18). 


Pinkerton and Kelly (58) also
used the Matrices and examined the 
results of 40 patients (age range, 60 
to 90) clinically graded into five 
groups from A to E, E being the most 
demented. These authors attempted 
to evaluate not only extent of intel- 
lectual deterioration but also, “the
degree of emotional adjustment to
that deterioration.” In their study,
quantitative scoring showed signifi- 
cant differences only between groups 
A and E. Their qualitative, emo- 
tional judgment score involved prin- 
cipally the kind of error made by 
patients after their last correct choice. 
It was suggested that: “By determin-
ing the ratio between the percentage
marks up to the limit of problem dif-
ficulty reached, and the percentage
score in the problems beyond that
point, a rough index of emotional ad-
justment to deterioration is given”
(58, p. 251). It seems to the writer
that while it might be of interest to
obtain further external evidence con-
cerning the meaningfulness of such a
score, merely to have these authors’
demonstration that on the whole, the
more demented obtain a better score
on emotional adjustment, makes one
rather sceptical of its “reality value.”













Cleveland and Dysinger (22) tested 
20 senile patients (age range 64 to 83) 
on the Wechsler-Bellevue Intelli- 
gence Scale (74) and on the Gold- 
stein-Scheerer Object Sorting Test 
(33). They showed that these pa-
tients tended to experience marked 
difficulty on the Performance Scale 
of the Wechsler and that this diffi- 
culty was commonly associated with 
a complete inability to handle the 
kind of abstractions required for suc- 
cess on the sorting test, even if the 
patient were of fairly average ability 
on the Wechsler Verbal Scale. 

Lovett Doust et al. (52) also used 
the Wechsler scale in a study de- 
signed to examine the relations be- 
tween psychometric status and ar- 
terial oxygen saturation (measured 
spectroscopically). The subjects were 
83 patients (age range, 62 to 88) so 
selected clinically as to represent a
“continuum of senile dementia.” A
highly significant correlation (r = .47)
was found between the psychological
and physiological indices. It is of
interest to note that the mean subtest 
intercorrelation (r=.63) for this ab- 
normal group was higher than that 
reported for the normal population 
on whom the test was standardized. 

Lassen et al. (50) also showed a re- 
lationship between the clinical esti- 
mate of dementia, cerebral oxygen 
consumption and performance in 19 
patients (18 of whom were between 
42 and 67) on the Goldstein-Scheerer
Block Design test. 

Silverman et al. (67) in one of a 
series of studies which included 56 
hospitalized senile patients (all over 
60) showed a relation between intel- 
lectual efficiency and EEG findings 
such that elderly persons with “mixed”
or “diffuse” EEGs did worse on psy-
chological tests, including the Wechs-
ler, than those with “normal” or
“focal” EEGs. These authors also
showed that the “mixed” or “dif-
fuse” picture predominated in the
hospitalized senile patients.

Group comparisons. A compara- 
tive study involving several groups 
of elderly psychiatric patients is re- 
ported by Rabin (59). He compared 
the Wechsler results of four groups, 
clinically diagnosed as senile (N = 15),
arteriosclerotic (N = 30), miscellane-
ous (N = 30), and nonpsychotic (N
= 25). He failed to find significant
differences in subtest patterning be- 
tween the groups, but the total quan- 
titative scores were lower in the senile 
and cerebral arteriosclerotic groups. 
The Verbal-Performance discrepan-
cies also varied in size, being largest 
in the seniles, next largest in the 
cerebral arteriosclerotics, less in the 
miscellaneous group and least in the 
nonpsychotics. It should be re- 
marked, however, that Rabin re- 
ported differences in mean age be- 
tween the groups, although not 
enough data are presented to enable 
the significance of this difference to 
be checked. Rabin himself concluded 
that age might have been the relevant 
factor which produced the found dif- 
ferences in performance. 

Botwinick and Birren (9) ex- 
amined the validity of three separate 
indices sometimes used to detect ab- 
normal intellectual deterioration in 
the elderly. These were, (a) the De- 
terioration Quotient (DQ) derived 
from the Wechsler-Bellevue by com- 
paring “hold” against “don’t hold”
subtests, (b) the Efficiency Index
(EI) from the Babcock test, and (c) 
the Senescent Decline Formula de- 
vised by Copple (23) and based on 
the Wechsler. The experimental 
group comprised 31 hospitalized pa- 
tients clinically diagnosed as psycho- 
sis with cerebral arteriosclerosis or 
senile psychosis (age range, 60 to 70). 
The control group consisted of 50
normal elderly persons (age range, 60
to 69) (cf. Fox and Birren [31]). Sig- 
nificant differences were found be- 
tween the patients and the controls 
on the EI and SDF measures, but 
not on the DQ measure. These au- 
thors concluded that: 

The measure which best differentiated the
two groups was the EI. This index contains a
category called “initial learning” which is
comprised of three subtests. The difference
between the two groups for this category was
greater than the differences in any of the
Wechsler-Bellevue subtests. Future attempts
to refine measures of intellectual deficit might
well exploit the use of such learning subtests
(31, p. 148).


In another paper Botwinick and 
Birren (10) compared the same pa- 
tients and controls in terms of their 
differential performance on each of 
the Wechsler-Bellevue subtests.
They were able to show that the 
amount of decline in any subtest with 
age was not necessarily a criterion of 
that subtest’s ability to differentiate 
between the experimental and con- 
trol groups. 

Birren (7) factor analysed the 
Wechsler results of these 31 patients 
and also the results of a group of 99 
normal elderly persons (age range, 60 
to 74). Four factors were extracted 
from the correlation matrices; these 
were labelled Verbal Comprehension, 
Closure, Rote Memory and Induc- 
tion. Before this analysis took place, 
however, it was found that the me- 
dian correlation between subtests for 
the abnormal (r=.63) was higher 
than that for the normal group (r 
=.53). This is in close agreement 
with the higher intercorrelation for 
an abnormal group found by Lovett 
Doust et al. (52). 
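The comparison of intercorrelations above turns on a single summary figure, the median of the correlations between pairs of subtests. A minimal Python sketch of how such a figure can be obtained from a correlation matrix is given below. The function name and both matrices are hypothetical and purely illustrative; they are not the data of Birren or of Lovett Doust et al.

```python
from statistics import median


def median_intercorrelation(corr):
    """Median of the entries above the diagonal of a square correlation matrix."""
    n = len(corr)
    # Each pair of subtests appears once above the diagonal.
    upper = [corr[i][j] for i in range(n) for j in range(i + 1, n)]
    return median(upper)


# Hypothetical 3 x 3 subtest correlation matrices (illustrative values only).
group_one = [
    [1.0, 0.5, 0.6],
    [0.5, 1.0, 0.7],
    [0.6, 0.7, 1.0],
]
group_two = [
    [1.0, 0.40, 0.50],
    [0.40, 1.0, 0.45],
    [0.50, 0.45, 1.0],
]

if __name__ == "__main__":
    print(median_intercorrelation(group_one))
    print(median_intercorrelation(group_two))
```

The median is used here, as in the passage above, because it is less affected than the mean by one or two extreme coefficients in a small matrix.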

Dörken and Greenbloom (25) com-
pared the Wechsler results of 67 pa- 
tients (age range, 66 to 89) with a 
clinical diagnosis of either senile psy- 
chosis or cerebral arteriosclerosis 


COGNITIVE DEFICIT IN ELDERLY PSYCHIATRIC PATIENTS 


201 


with the results of a control group of 
20 normal subjects (age range, 65 to 
80). They found that the abnormal 
group were lower in intelligence,
showed larger discrepancies between 
the Verbal and Performance scale re- 
sults and also had higher intercorrela- 
tions between their subtest scores. 
They also found Copple’s SDF to be 
a more valid measure than Wechsler’s 
DQ. 

Robertson and Wibberley (63) 
gave a large battery of tests to 24 
housewives (age range, 40 to 60); 
eight of these were clinically stated 
to be “demented,” eight were dull but 
not deteriorated and eight were de- 
fective but not deteriorated. The 
original ability of these patients was 
very carefully established by bio- 
graphical enquiry and the independ- 
ent ratings of several judges. It was 
found that the original ability of the 
demented group was best reflected in 
tests involving verbal material while 
their deterioration showed itself in 
material involving visuo-spatial abil- 
ity. 

Roth and Hopkins (64) compared 
the results of 14 senile psychotics 
(median age 78) with those of 46 
elderly ‘functional’ patients (me- 
dian age 67) on four tests, Wechsler 
Vocabulary, Digit Span, Raven's 
Matrices and an Information test. 
Since these groups differed in age, a 
further control group from a general 
hospital, comprising 14 elderly pa- 
tients of the same age as the senile 
group, was also tested. Significant 
differences were found between the 
functional and normal subjects on the 
one hand and the senile psychotics on 
the other on all four tests. However,
while the ability of the seniles did not
overlap with the other groups at all,
for example, on the Matrices, their
ability to define words and repeat
digits was relatively well preserved
in some cases. Hopkins and Roth (42)
compared the above results with 
those of further groups of elderly psy- 
chiatric patients diagnosed as para- 
phrenic (N = 15), arteriosclerotic psy-
chosis (N = 22), and as acute confu-
sional state (N = 17); there was no
significant difference in age between 
these groups. Test scores placed the 
paraphrenics and confusional states 
with the functional patients, while 
the arteriosclerotics overlapped both 
with these and the senile group. 
Orme (55) examined the results of 
25 clinically diagnosed cases of senile 
dementia (mean age, 73.84) and a 
group of 24 elderly depressed patients 
(mean age, 68.17) and two groups of 
healthy old people (56) on the Ma- 
trices and the Mill Hill Vocabulary 
Scale (Raven [60]). Significant dif-
ferences were shown between these
groups on both tests; they were, how-
ever, also significantly different in
mean age. Orme claims that the dif-
ferences found between the tests can-
not be accounted for in terms of the 
age difference, since the only signifi- 
cant correlation with age is between 
it and the vocabulary scale within the 
senile group alone. He also suggests, 
contrary to the usual beliefs about 
intellectual deterioration, that a de- 
cline in verbal ability may be the 
most fundamental characteristic. 
This latter argument he supports by 
showing the groups to be less dif- 
ferent on a score of “intellectual po-
tential” (derived from the Matrices
by taking as a score the last point
where a particular problem was
solved on two occasions of testing)
than on a score of “intellectual pro-
ductivity” (or the simple sum of
items correct on the test). “By this
method,” the author states, “it is
possible to assess each patient’s maxi-
mum potential of intellectual ability
disregarding losses during perform-
ance caused by an apparent fluctua-
tion in ability” (55, p. 865). What-
ever the merits of this measure, it 
seems to the present author that 
Orme’s contention is somewhat weak- 
ened by the fact that only within the 
depressed group are there significant 
correlations between either “poten-
tial” and the Vocabulary score or
“productivity” and the Vocabulary
score. If, as Orme claimed, difference 
in verbal ability determined the dis- 
turbance in intellectual productivity, 
then it seems that significant correla- 
tions would be expected to appear be- 
tween at least the “productivity” 
measure and the Vocabulary score 
within the senile group itself. 

In addition to the comparative 
work reported above, which has 
mainly used fairly well-standardized 
tests, a few studies have been re- 
ported which have employed less 
well-standardized material for the 
examination of cognitive functions. 

Thus, Hall (37) studied 70 patients 
(age range, 41 to 65) categorized as
“definitely organic,” “definitely de-
pressive,” or “doubtful” on the basis
of physical, CSF, AEG and EEG ex- 
amination. He found that conceptual 
tests, including a block sorting test, 
differentiated best between the 
groups. 

On the other hand Hopkins and 
Post (41) showed that while 49 psy- 
chiatric patients (age range, 60 to 
84) were, on the whole, less capable 
on such tasks as the Goldstein- 
Scheerer Cubes and Colour Form 
Sorting Test than 49 normal controls 
(age range, 60 to 87), there were no 
striking differences within subclasses 
of the psychiatric group. They there- 
fore conclude that: 


The differentiating value of the Goldstein
tests used appears to be limited to the nega-
tive finding that preservation of the abstract
attitude in elderly people makes the presence
of organic cerebral disorder unlikely. Failure
in test performance does not necessarily indi-
cate the presence of brain damage (41, p.
849).


Thaler (71) also found that 66%
of her group of 116 normal old people
(mean age, 73.04; SD, 7.53) per-
formed concretely on the Colour
Form Sorting test.

Studies of more specific aspects of 
mental function in elderly psychi- 
atric patients include those of Birren 
and Botwinick (8) who examined the 
effects of age and senile psychosis 
on motor speed (in writing digits and 
words). They contrasted the results 
of 35 patients (age range, 60 to 70) 
institutionalized for senile psychosis 
or cerebral arteriosclerosis with psy- 
chosis with the results of a control 
group of normal persons of the same 
age. Statistically significant differ- 
ences were found in writing speed, 
the patients being slower than the 
normals. 

Williams (77) has studied the dif- 
ferences in responses made to visual 
stimuli by groups of patients diag- 
nosed as suffering from early senile 
dementia and mentally normal eld- 
erly persons. The former groups per- 
formed more poorly, only improving 
if the tasks were simplified or if addi- 
tional cues were presented. 

Discussion. The main trends
which emerge from the review of
studies of decline in general intellec-
tual functioning seem to be as fol-
lows.

1. Given the clinical diagnosis of 
dementia, psychometric studies in 
general show patients so categorized 
to be poorer intellectually than con- 
trol groups of other elderly persons, 
whether “functional psychotics” or
normals. In a way, this is hardly sur-
prising since one of the main criteria
commonly used for placing a patient
in the “dement” category is a clinical
impression of cognitive impairment.
Such a process of indirect contamina- 
tion has been studied by Shapiro et 
al. (66) and has been shown to be sig- 
nificant. However, this latter process 
is only of major importance if a hy- 
pothesis is being tested, e.g., that 
such deficit is due to organic aetiol- 
ogy. It may be of less importance 
when the purpose of the study is 
mainly descriptive. 

2. Some studies have, however,
shown a relation between psychologi-
cal adequacy and physiological ef-
ficiency, confirming that the main-
tenance of general intellectual func-
tioning may depend to some degree
on cerebral metabolism, especially on
the cerebral uptake of oxygen (50,
52).

3. Some studies have demonstrated
that higher intercorrelations may be
found in intellectual test performance
among abnormal groups (7, 52).

4. The existence of some kind of
“verbal-performance discrepancy” in
cases of senile deterioration has been
found in a number of studies (10, 11, 
22, 25, 38, 59, 63). Other studies, 
however, have made the important 
point that verbal ability can by no 
means be regarded as an infallible, 
constant “intellectual high-water
mark” and have shown that verbal
ability may itself be affected by the 
process of deterioration (2, 39, 55, 
57, 64, 65, 72). 

5. The relation of such findings as 
these latter to differences between
“fluid” and “crystallized” ability and
the relation of these, in turn, to new
and old learning has been emphasized
by Eysenck (27). Birren also has
noted that: 


Intelligence tests such as the Wechsler-
Bellevue seem to measure some diffuse combi-
nation of (a) the mass of previously learned or
stored information and (b) abilities to acquire
information. Success on a test item would
seem to be determined by whether a similar
bit of information has previously been pre-
sented to the individual, by his capacity to
store it, and by his ability to associate a cur-
rently presented bit of information with rele-
vant stored information and pertinent infor-
mation in the context of the item. As an indi-
vidual grows older his environment changes in 
the relative frequency of occurrence of certain 
types of information. In addition to the age 
changes in frequency with which certain in- 
formation occurs or is reinforced, there is the 
possibility that the individual suffers a di- 
minution in abilities. He may not learn as 
rapidly or associate a presented bit of in- 
formation with his stored information... . 
The close association of test performance in 
the elderly with stored information suggests 
that to a greater extent than in young sub- 
jects their performance is determined by what 
they already know than by what new informa- 
tion they can elicit from the test situation. 
With senile brain changes there is not only a 
further reduction in the ability to extract new 
information but there is also a large loss of 
stored information as well (7, pp. 403-405). 


Psychological studies point, in fact, 
to a connection between intellectual 
deterioration and some disorder in 
learning ability. It remains to be seen 
whether evidence for such a connec- 
tion between these two aspects of 
function emerges, so to speak, from 
the other side, in studies of “memory
disorder” itself. 


MEMORY DISORDER 


Before psychological studies in this 
area are examined it is necessary to 
emphasize two considerations, one 
mainly theoretical, the other more 
practical, both concerned with the 
notion of memory in general. 

In the first place, neither the status 
of “memory function” as a construct,
nor its relation to other cognitive 
abilities has ever been clearly de- 
cided. Clinically it has commonly 
been considered as distinct from gen- 
eral intellectual functioning but as 
Spearman (69) emphasized many 
years ago, clinical usage need not al- 
ways be an ideal guide in such mat- 
ters. Simmins (68), for example,
found in a sample of 200 mental hos- 
pital patients that when g scores were 
partialled out from results on so- 
called memory tests, the ability of 
over 90% of her patients fell within
the normal range. Eysenck and Hal-
stead (26) showed in a normal group
of 60 young soldiers that most of the
variance on several “memory tests”
could be accounted for in terms of 
general intelligence. 
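
The operation of “partialling out” g on which these two studies rely can be illustrated with a first-order partial correlation. What follows is only a minimal sketch in Python: the function name and the three correlation values are hypothetical and are not taken from Simmins (68) or from Eysenck and Halstead (26).

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y with z held constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values, chosen only for illustration.
r_memory_tests = 0.60  # correlation between two "memory tests"
r_test1_g = 0.70       # correlation of the first test with g
r_test2_g = 0.80       # correlation of the second test with g

r_residual = partial_correlation(r_memory_tests, r_test1_g, r_test2_g)
print(round(r_residual, 3))
```

With values of this kind the correlation remaining between the two tests, once g is held constant, is close to zero. That is the sense in which most of their common variance may be said to be accounted for by general intelligence.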

However, a study by Inglis et al. 
(46) has suggested that the notion of 
“memory function” may be a useful
one in the study of the behaviour of 
elderly psychiatric patients even 
when general intellectual level has 
been partialled out of their perform- 
ance. 

Secondly, as Hull (43) pointed out 
more than 40 years ago, the notion of 
“memory,” as commonly used, is too
narrow in meaning, having come to
refer almost exclusively to the process
of reproduction of learned material.
It might be better to substitute
“learning ability” for memory even
in clinical discussions, since the
former expression is wider in connota-
tion. It admits of several aspects and
takes into account, for example, at
least the two broad phases of “fixa-
tion” or “acquisition,” and “reten-
tion.” So long as it is clear that the
distinction between these phases is
not absolute, as McGeoch and Irion
(53) have pointed out, it is likely that
“learning” will be a more convenient
term than “memory” for the analysis
of the behaviour concerned. 

Bearing these considerations in 
mind, studies which have attempted 
experimentally to examine learning 
ability in elderly psychiatric patients 
may be examined in chronological 
order. 

The results of the earliest studies 
can, at best, be regarded as only sug- 
gestive. They commonly employed
very few, and very mixed, cases; they
usually omitted to control important 
independent variables and seldom 
carried out any statistical checks. 
Among such studies is a report by 
Achilles (1) who examined, among 
other subjects, 16 mixed, mainly eld- 
erly, psychiatric cases (age range, 
about 36 to 84), suffering from
Korsakoff’s syndrome, arterioscle-
rosis, and GPI. The subjects were not
required to learn the material used up
to a set criterion but were simply
given 2 presentations of two sets of
words and one set each of “forms,”
“proverbs,” and “syllables.” Achil-
les concluded that, “No attempt will
be made . . . to average the results of
the subjects in the insane group but
. . . all show a memory defect and
the defect is present in both recall
and recognition” (1, p. 64).

Moore (54) reported a study of 30
patients (age range, 28 to 74, 14 of
whom were over 50) including cases
suffering from GPI, senile dementia
and alcoholic deterioration. These
subjects were presented once with 8
stimuli in each of 4 series: real ob-
jects, pictures of objects, and printed
and spoken words. These had to be
recalled immediately and then again
one minute later. The patients were
also required to name 50 pictures.
Moore found evidence suggesting
that, while immediate memory, re-
tention, and perception all tend to
deteriorate together, nevertheless,
immediate memory and retention
represent “two distinct mental func-
tions.”

Liljencrants (51) examined only 4
abnormal cases (ages 40, 53, 54, and
60) and concluded that both appre-
hension and reproduction are affected
in cases of memory disorder when the
patients are required to recall or rec-
ognize meaningful or nonmeaningful
visual stimuli.


Wylie (79) studied seven senile,
three presenile, and six paretic, “de-
teriorated” cases (age range, 29 to 87,
11 of whom were over 50). Her re-
sults suggested that such patients
could neither acquire nor retain in
situations in which they were re-
quired to remember pictures or to
learn paired associates, including
meaningful, nonmeaningful and non-
sense material. They failed both in
the recognition and recall of such ma-
terial.

Hunt (44) noted in his review of
such studies that: “The control of
most of the psychological governors is
inadequate because it is not under-
stood. Even the conception of mem-
ory varies. Nevertheless, certain
facts stand out” (44, p. 13). Two of
these outstanding facts are of im-
portance here: (a) “impression” often
suffers more than retention, and (b)
inefficiency is evident in tests of both
recall and recognition.

An example of a more sophisticated
approach to the study of memory dis-
order in elderly psychiatric patients
is provided by the work of Shakow,
Dolkart, and Goldman (65). They
used a version of the Wells Memory
Test (Wells and Martin [76]), which
involves both old recall (e.g., of per-
sonal information) and new recall
(e.g., memory for a sentence). They
examined 56 cerebral arteriosclerotics
with psychosis (age range, approxi-
mately 55 to 75) and 48 senile psy-
chotics (age range, approximately 56
to 85). The performance of these
groups was compared with that of
normal controls selected from the
same age range of 192 normals (age
range, 15 to 90) also tested. The
only control for general level was in
terms of previous education since the
authors found both abnormal groups
were lower in Vocabulary score than
the normals. They found consist-
ently, although not markedly, poorer
results on these memory tests in the 
psychotic groups. All groups were 
better on old than on new recall and 
seniles were poorer than arterio- 
sclerotics on both tests. The authors 
do not, however, provide assessments 
of the statistical significance of these 
differences. It is of interest that in 
this study also there were higher in- 
tercorrelations between tests in the 
senile group than in the normals or 
the arteriosclerotics. 

D. E. Cameron also conducted a 
series of investigations into “memory
disorder” in aged patients. In one 
study (17) he tested 33 elderly pa- 
tients (ages not given) showing mem- 
ory defect. Eighteen of these were 
given pairs of drawings to copy and 
also to draw from recall. Cameron
found great “interference” between
the pairs and suggested that this was
due to a greater tendency to persev-
erate. Fifteen other patients were
asked to name two simple objects
which had been presented on three
successive occasions. They showed a
tendency to “secondary elaboration”
of the material. The author con-
cludes that,

. . . in patients suffering from the psychoses
of the senium there are a greatly increased 
tendency to perseverate and a greatly ac- 
celerated tendency to secondary elaboration 
of memorized data. The first process, by in- 
terfering primarily with registration, and the 
second process, by interfering with retention, 
contribute materially to the impoverishment 
of recent memory in these patients (17, 
p. 992). 


The same investigator (Cameron 
[18]) showed that, in 16 elderly pa- 
tients (ages not given) suffering noc- 
turnal delirium, their wandering and 
confusion could be brought on by 
their being placed in a darkened room 
during the day. This demonstrated 
that their condition was not brought 
on by fatigue, as has sometimes been
supposed, but by cue-deprivation. 
Since these patients also showed se- 
vere learning disability Cameron sug- 
gested that such delirium may be 
based on an inability to maintain a 
spatial image without the assistance 
of repeated visual stimulation. Thir- 
teen of these patients also showed a 
severe distortion of spatial imagery 
(e.g., in the displacement of the re- 
membered positioning of environ- 
mental objects) within an hour of be- 
ing blindfolded. 

Cameron (19) also compared 12 
seniles (ages not given) with memory 
defect with a group of younger per- 
sons on a test of memorizing 3 digits. 
These they were required to repro- 
duce after varying periods of time 
(cf. Feldman and Cameron [30]). If 
the interval between acquisition and 
reproduction was unfilled, the seniles 
could remember the series for brief 
periods. If the period was occupied 
by another task, however, they failed 
in recall, unlike the controls, who
showed no such difficulty.

In all, Cameron gives emphasis to
the importance of disturbances of the
early stages of retention, or what the
present author would prefer to call
the “acquisition” stage in learning.

Cameron et al. (20) also showed in
23 patients (age range, 60 to 87) suf-
fering from psychoses of the senium
characterized by “memory defect,”
that they also suffered defects of
cerebral oxygen metabolism.

A study by Kral and Durost (48)
using the Wechsler Memory Scale
(75) included 10 senile dements
(mean age, 78) and 10 cases of Korsakoff’s
syndrome (mean age, 51.5). They
found impairment of immediate re-
call and recent memory in both these
groups. The control group in this
study was inadequate, comprising 10
hospital personnel with a mean age of
only 29.

Williams (78) demonstrated in a 
study of 60 senile dements (age range, 
65 to 95) that such patients may ex- 
perience gross difficulties in learning 
even the simplest paper-maze tasks, 
although their performance can be 
aided to some degree by recent per- 
formance on similar tasks. No con- 
trol data are quoted in this study. 

At least one further investigation 
has demonstrated a relation between 
learning ability and physiological 
efficiency in relation to the oxygen 
uptake of the brain. Lassen et al. 
(50) using 19 patients (18 of whom 
were between 42 and 67) found a re- 
lationship between the clinical im- 
pression of dementia, cerebral oxygen 
metabolic rate, and the ability to 
learn a series of digits when this was 
only slightly longer than the immedi- 
ate memory span. These authors re- 
ported the results on this test in 
terms of a “fluctuation score.” They
found least fluctuation in the nonde-
mented and the most demented
groups, fluctuation appearing only
in the middle range. It seems to the
present writer that this finding may
have been due to the fact that the
test was too easy for one extreme
group and too difficult for the other, 
variations only being possible in the 
middle group. It is likely that a more 
meaningful score would have been 
obtained if they had counted the 
number of trials to the criterion of 
learning. Zangwill (82) previously 
found this method to provide a useful 
test of learning in organic cases. 
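
The argument of the preceding paragraph can be sketched numerically. The Python fragment below is an illustration only. It assumes, hypothetically, that each trial is passed independently with a fixed probability p, that a “fluctuation score” counts reversals between success and failure over a fixed number of trials, and that the criterion of learning is a run of consecutive correct trials. None of these definitions is taken from Lassen et al. (50) or from Zangwill (82).

```python
def expected_reversals(p, n_trials):
    """Expected number of pass/fail reversals over n_trials independent trials.

    Hypothetical stand-in for a "fluctuation score": each adjacent pair of
    trials differs with probability 2 * p * (1 - p).
    """
    return (n_trials - 1) * 2 * p * (1 - p)


def expected_trials_to_criterion(p, run_length):
    """Expected trials until run_length consecutive passes (requires 0 < p < 1)."""
    return (p ** -run_length - 1) / (1 - p)


# Hypothetical pass probabilities for three levels of difficulty.
groups = [("test too difficult", 0.05), ("middle range", 0.50), ("test too easy", 0.95)]

for label, p in groups:
    reversals = expected_reversals(p, n_trials=10)
    trials = expected_trials_to_criterion(p, run_length=2)
    print(f"{label:18s} reversals={reversals:6.3f} trials_to_criterion={trials:8.2f}")
```

Under these assumptions the expected number of reversals is small, and identical, at both extremes and is largest in the middle range, whereas the expected number of trials to criterion falls steadily as p rises. A score of the second kind would therefore separate the two extreme groups, which a fluctuation score cannot do.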

Robertson (62) has recently re- 
ported an experiment in which, 
among other subgroups, 26 elderly 
patients (age range, 60 to 79) diag- 
nosed as suffering from various kinds 
of brain damage including “organic
senile dementia” were compared with
40 non-brain-damaged, psychotic pa- 
tients (age range, 60 to 79). The ex-
perimental group were found to be 
significantly worse on tests of paired 
associate learning. When both groups 
were further subdivided into a “low
vocabulary” (Wechsler Vocabulary
score, less than 23) and a “high vocab-
ulary” group (23 or above), it was
shown that the low vocabulary 
groups were also significantly dif- 
ferent on a rote-learning test, and the 
high vocabulary groups almost so. 
This study clearly demonstrated the 
need for controlling at least the 
factors of verbal ability and age in 
any experimental study of learning 
ability. 

Discussion. From this review of 
the experimental work concerned 
with disturbances of learning ability 
in elderly psychiatric patients, it may 
be seen that there are certain find- 
ings or features in common with the 
studies of intellectual deterioration 
already reviewed. 

1. Given the clinical diagnosis of 
dementia, patients so categorized 
can be shown to suffer some disabil- 
ities in learning, both in acquisition 
and in reproduction (recognition and 
recall). 

Many of the studies reviewed 
have, however, involved poor experi- 
mental design and have failed to ad- 
here to adequate experimental tech- 
niques (e.g., they have not studied 
learning through the various sense 
modalities, nor have they set up ade- 
quate criteria of learning). 

In addition, most of these investi- 
gations are affected by the same proc-
ess of “indirect contamination” as
were the studies of intellectual de-
terioration. This is of considerable
importance if it is intended to test
the common psychiatric hypothesis
concerning the “organic” aetiology
of memory disorder, or if it is re-
quired to show that there is some nec-
essary connection between intellec-
tual deterioration and learning dis- 
order in senility. 

2. Some studies have, however, 
suggested that learning efficiency 
may be related to physiological ade- 
quacy. Indirectly this notion is sup- 
ported by some of the findings made 
by Simmins (68) and also by Inglis 
et al. (46). More direct evidence 
comes from those studies which have 
related memory function to cerebral 
oxygen consumption (20, 50). 

3. At least one study has confirmed 
the existence of higher correlations 
between learning and other tests in 
disordered groups (65). 

4. That some important relation- 
ship exists between some aspects of 
general intellectual deterioration and 
learning disability is confirmed by a 
recent study by the present author 
(45). Using small groups of elderly 
psychiatric patients (age range, 56 to 
81) selected solely on the basis of the
presence or absence of clinically ascer-
tained memory disorder, and matched
for age and verbal ability, he was able
to demonstrate that: (a) Experimen-
tal and control groups did not differ
in terms of immediate memory span
for digits forwards. (b) They did,
however, differ very significantly in 
their rate of and degree of success in 
the acquisition of paired associate 
items, whether visually or auditorily 
presented, whether recalled or recog- 
nized. (c) The experimental group 
showed a significantly larger Verbal- 
Performance discrepancy on the 
Wechsler than the control group, and 
(d) There was a higher correlation be- 
tween such a discrepancy and learn- 
ing (i.e., the acquisition phase) in 
the experimental than in the control 
group. 

PSYCHOLOGICAL EVIDENCE FROM ATTEMPTS AT TREATMENT

In addition to those studies al-
ready considered, in which attempts 
were made to examine psychological 
functions more or less in isolation, 
there have been reported in the liter- 
ature a number of studies which have 
been concerned to improve physio- 
logical and metabolic functioning in 
elderly psychiatric patients and to 
assess the effects of such treatment 
on psychological function. 

Three main therapeutic avenues 
have been explored; these comprise 
treatment through increase of oxygen 
utilization, vitamin therapy, and sex 
hormone administration. 

Cameron (19), for example, at- 
tempted to increase the supply of 
oxygen to the brain and also tried to 
increase the utilization of oxygen by 
the brain in elderly psychiatric pa- 
tients, with negative results. A study 
by Garnett and Klingman (32) used 
injections of an oxygen-transferring 
enzyme, Cytochrome C, in 17 mainly 
elderly psychiatric patients (age 
range, 43 to 79, 11 of whom were se- 
nile, presenile or arteriosclerotic). 
These authors claimed “clinical im- 
provement” of memory and other 
functions in most of these cases.
There was, however, no significant 
change in Wechsler scores between 
pre- and posttreatment assessment. 
No untreated control group was used 
in the study.

Cameron (19) also administered
vitamins (B1, nicotinic acid, etc.)
without result. Stephenson et al. 
(70) had previously shown that vita- 
min treatment had beneficial effects 
at least on psychomotor performance 
in 40 senile dementia patients (age 
range, 65 to 86) as compared with 18 
untreated senile patients (age range, 
60 to 87). The outcome of a study of 
vitamin administration in a large 
group of male senile patients reported 
by Vernon and McKinley (78) was, 
however, almost wholly negative
both as regards psychomotor and in- 
tellectual test performance. On the 
other hand a paper by Gregory (35) 
concluded that vitamin therapy is ef- 
fective in senile disorders of recent 
onset. This author administered nic- 
otinic acid to his experimental group 
but used no untreated controls. He 
reported only the results of his suc- 
cessful cases and in any case his esti- 
mates of improvement were mainly 
based on clinical impressions. Al- 
though intelligence tests were used in 
some cases the tests were not de- 
scribed. In view of these facts Greg- 
ory’s conclusions must be regarded as 
speculative. A well designed study 
has recently been reported by Kra- 
wiecki et al. (49). These authors 
demonstrated improvement on the 
Wechsler Memory Scale in 25 senile 
psychotics (age range, 65 to 83) fol- 
lowing injections of the vitamin prep- 
aration, Parentrovite, as compared 
with 25 placebo-treated senile con- 
trols matched for age and pretreat- 
ment memory ability. 

The effect of sex hormone admin- 
istration on psychological functioning 
was also examined by Vernon and 
McKinley (78). Using male hormone 
in their male senile patients they 
found no effect on either psycho- 
motor or intellectual tests. The re- 
sults of a well-designed and con- 
trolled study conducted by Caldwell 
and Watson (15), however, showed
that, in nonpsychotic cases at least, 
the administration of female sex hor- 
mones to 15 elderly women (age 
range, 54 to 88) improved their per- 
formance on the Wechsler Memory 
Scale, especially on the Paired Asso- 
ciates subtest. This improvement 
was significantly greater than that 
of a matched, placebo-treated control 
group. Further reports by Caldwell 
and Watson (16) and Caldwell (14) 
showed that the experimental group
continued its improvement over at 
least a year, so long as the hormone 
administrations were continued.

Such studies serve to confirm the
conclusion already drawn from
studies principally concerned with 
intellect and learning, that some rela- 
tion exists between physiological, 
probably metabolic, adequacy and 
psychological efficiency in the psy- 
chiatric disorders of old age. 


THEORETICAL CONSIDERATIONS 

The view has often been expressed 
that human cognitive behaviour is to 
be understood in terms of some bal- 
ance between the capacity to acquire 
and the ability to use what has been 
acquired. Thus we have the notions 
of “fluid” and “crystallized” ability
proposed by Cattell (21), and the
processes of “accommodation” and
“assimilation” suggested by Piaget
(Berlyne [5]). One outstanding at-
tempt functionally to relate these
processes to each other and to unite
them within a theory of brain func-
tion has been made by Hebb (40).

Briefly, the “neuropsychological”
theory propounded by Hebb suggests
that short-term “memory” may be
maintained by rather complex rever-
berating neuronal circuits, while re-
peated excitation of these would lead
to permanent changes in the inter-
connections of their constituent neu-
rones, so forming the “cell-assemblies”
which are proposed as the basis of
learning (and so of longer lasting
“memory” processes). Activity of
such circuits can be facilitated from
sensory input or from other assem-
blies; this latter process can, in turn,
result in the creation of more or less
permanent processes of interfacili-
tation between assemblies to form
“phase-sequences,” which may be con-
sidered to provide the neural basis of 
“attention” and even of “thought.”

This theory provides at least a 
speculative basis for regarding learn- 
ing and intelligence as lying along 
some “physiological continuum.” In-
telligence itself Hebb (40, p. 275f.)
regards as dividing into “intelligence
A,” or innate potential, and “intelli-
gence B,” the functioning of the
brain in which development and 
learning have gone on. 

Integration of this theory with the 
experimental findings reviewed above 
may first be pursued through the 
psycho-physiological relations dis- 
covered by some investigators (20, 
50, 52, 67). Hebb has pointed out 
that: 


A (brain) region in which the blood supply 
is interfered with, but not entirely shut off, 
usually shows some loss of cells and a number 
of remaining cells whose staining properties 
are changed. This indicates a change in the 
chemical properties of the cell, which in turn 
implies a change in frequency properties and 
obviously may account for the existence of a 
hypersynchrony which interferes with the 
functioning of the cell-assembly. A focus of 
hypersynchrony must act as a pacemaker that 
tends to wean transmission units away from 
the assembly. When hypersynchrony is not 
great, it would allow some assemblies to func- 
tion (particularly those that are long estab- 
lished) but would tend to interfere with recent 
memory, decrease responsiveness and inter- 
fere with complex intellectual activities. 
When it is more extensive, it would prevent all 
higher functions (40, p. 283). 


Concerning long-established assem- 
blies, Hebb notes that, 


One might assume that longer-established 
assemblies would have in general greater
safety margins. . . . This would imply that
older habits, and longer-established memories,
would be most resistant to disruption by
metabolic changes of blood content (40,
p. 197). 


It is tempting to try further to re- 
late these speculations to those of 
such workers as Eysenck (27) and 
Birren (7) regarding the reorganiza- 
tion of mental abilities in the older 
adult.
They could be used to explain not 
only why Verbal-Performance type 
discrepancies have been found, but 
also why, if the disorder is far ad- 
vanced, verbal ability (or “old learn-
ing”) may itself decline. It may be
suggested that the psychological ef- 
ficiency of elderly persons mainly de- 
pends on crystallized ability but 
nevertheless a modicum of “fluidity,” 
or learning ability, is also very nec- 
essary. When this latter drops below 
a certain “threshold” level not only 
is ability on learning tasks affected, 
but the type of discrepancy some- 
times noted between Verbal and Per- 
formance tasks makes its appear- 
ance, since Performance type tests re- 
quire more “learning” of instructions
and present more unfamiliar prob-
lems to be solved. This hypothesis is
also congruent with the results found
on “sorting” tasks (22, 37), which are
commonly of the “concept forma- 
tion” type. 

The notion of the proposed “thresh-
old effect” is also consistent with an
idea put forward by Shakow et al.
(65) who noted that, since they
found no correlation within the senile
group between age and memory,
“. . . a person who develops senile
psychosis at 80 is an individual who
at 80 has preserved his functions as
well as the person who develops the
psychosis at 60. It is as if there were
a threshold level of preservation of
function and when that threshold is
reached the person is ready to suc-
cumb to the psychosis” (65, p. 46).

The increased correlations between
functions frequently found (7, 52,
65) in the abnormal groups might
also be due to this threshold effect.
For example, two tests, X and Y,
may have a “minimum requirement”
in terms of learning, which all “nor-
mal” persons possess, and otherwise
have little variance in common. Once
the individual falls below a certain
threshold of ability, however, the
common, “learning” element comes
to be of crucial importance and a 
higher correlation results. 
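
This account of the threshold effect can be illustrated by a small simulation. The Python sketch below is hypothetical throughout: the threshold value, the group sizes, the uniform spread of learning ability, and the unit-variance specific factors are assumptions made for illustration and are not drawn from any of the studies cited.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_between_tests(learning_levels, threshold, rng):
    """Correlate tests X and Y, which share only a capped learning element."""
    x_scores, y_scores = [], []
    for level in learning_levels:
        # Learning ability beyond the minimum requirement adds nothing.
        common = min(level, threshold)
        x_scores.append(common + rng.gauss(0, 1))  # X-specific factor
        y_scores.append(common + rng.gauss(0, 1))  # Y-specific factor
    return pearson(x_scores, y_scores)


rng = random.Random(0)
THRESHOLD = 6.0   # assumed "minimum requirement" in learning
GROUP_SIZE = 2000

normal_group = [rng.uniform(THRESHOLD, 2 * THRESHOLD) for _ in range(GROUP_SIZE)]
impaired_group = [rng.uniform(0, THRESHOLD) for _ in range(GROUP_SIZE)]

r_normal = correlation_between_tests(normal_group, THRESHOLD, rng)
r_impaired = correlation_between_tests(impaired_group, THRESHOLD, rng)
print(f"above threshold: r = {r_normal:.2f}; below threshold: r = {r_impaired:.2f}")
```

With these assumed values the correlation between X and Y should be near zero in the group above the threshold and substantial in the group below it, although the exact figures depend on the random seed.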

At a later stage even crystallized 
ability (or “intelligence B”) could be
disrupted with the consequent falling 
off, for example, in verbal ability. 

Attention may finally be drawn to 
a striking resemblance between the 
results of Cameron’s (18) cue-dep- 
rivation experiment with senile pa- 
tients and some experiments issuing 
from Hebb’s own laboratory. Bexton 
et al. (6), working with 22 young nor- 
mal volunteers, found that sensory 
deprivation, with its consequent dim-
inution of controlling sensory in-
put, over relatively long periods of
time (i.e., 2 to 3 days) resulted in
gross disturbances of behaviour.
When, as Hebb has described, the
functioning of the assemblies is itself
adversely affected by metabolic
changes it might be expected that
much shorter periods of cue-depriva-
tion would lead to disturbances of 
orientation and behavior, such as 
Cameron (18) showed in his elderly 
patients. Bartlet’s (4) review of vis- 
ual hallucinations in intellectually 
well-preserved elderly patients with 
cataract is of interest in this connec- 
tion. 


SUMMARY 


1. An attempt has been made to 
review the experimental literature to 
date on intellectual and learning im- 
pairment in elderly psychiatric pa- 
tients. 

2. Certain consistent relationships 
emerge from these studies indicative 
of possible links between learning 
ability and that differential cognitive 
impairment which is to be found in at 
least the early stages of senile de- 
terioration. 

3. The neuropsychological theory 
propounded by Hebb seems to pro- 
vide a convenient conceptual frame- 
work for the relations so far estab- 
lished. 


REFERENCES 


1. ACHILLES, EDITH M. Experimental
studies in recall and recognition.
Arch. Psychol., 1917, 6, No. 44.

2. ACKELSBERG, SYLVIA B. Vocabulary and
mental deterioration in senile dementia.
J. abnorm. soc. Psychol., 1944, 39,
393-406.

3. BABCOCK, HARRIET. An experiment in the
measurement of mental deterioration.
Arch. Psychol., 1930, No. 117.

4. BARTLET, J. E. A. A case of organized
visual hallucinations in an old man with
cataract and their relation to the
phenomena of the phantom limb. Brain,
1951, 74, 363-373.

5. BERLYNE, D. E. Recent developments in
Piaget’s work. Brit. J. educ. Psychol.,
1957, 27, 1-12.

6. BEXTON, W. H., HERON, W., & SCOTT,
T. H. Effects of decreased variation in
the sensory environment. Canad. J.
Psychol., 1954, 8, 70-76.

7. BIRREN, J. E. A factorial analysis of the
Wechsler-Bellevue scale given to an
elderly population. J. consult. Psychol.,
1952, 16, 399-405.

8. BIRREN, J. E., & BOTWINICK, J. The
relation of writing-speed to age and to
the senile psychoses. J. consult.
Psychol., 1951, 15, 243-249.

9. BOTWINICK, J., & BIRREN, J. E. The
measurement of intellectual decline in
the senile psychoses. J. consult.
Psychol., 1951, 15, 145-150.

10. BOTWINICK, J., & BIRREN, J. E.
Differential decline in the
Wechsler-Bellevue subtests in the senile
psychoses. J. Gerontol., 1951, 6, 365-368.

11. BRODY, M. B. The measurement of
dementia. J. ment. Sci., 1942, 88,
317-327.

12. BRODY, M. B. A psychometric study of
dementia. J. ment. Sci., 1942, 88,
512-533.

13. BROMLEY, D. B. Primitive forms of
responses to the Matrices test. J. ment.
Sci., 1953, 99, 374-393.

14. CALDWELL, BETTYE McD. An evaluation
of psychological effects of sex hormone
administration in aged women. II.
Results of therapy after eighteen months.
J. Gerontol., 1954, 9, 168-174.

15. CALDWELL, BETTYE McD., & WATSON,
R. I. An evaluation of psychologic
effects of sex hormone administration
in aged women. I. Results of therapy
after six months. J. Gerontol., 1952, 7,
228-244.

16. CALDWELL, BETTYE McD., & WATSON,
R. I. An evaluation of sex hormone
replacement in aged women. J. genet.
Psychol., 1954, 85, 181-200.

17. CAMERON, D. E. Certain aspects of
defects of recent memory occurring in
psychoses of the senium. Arch. neurol.
Psychiat., 1940, 43, 987-992.

18. CAMERON, D. E. Studies in senile
nocturnal delirium. Psychiat. Quart.,
1941, 15, 47-53.

19. CAMERON, D. E. Impairment of the
retention phase of remembering.
Psychiat. Quart., 1943, 17, 395-404.

20. CAMERON, D. E., HIMWICH, H. E., ROSEN,
S. R., & FAZEKAS, J. Oxygen consumption
in the psychoses of the senium.
Amer. J. Psychiat., 1940, 97, 566-572.

21. CATTELL, R. B. The measurement of
adult intelligence. Psychol. Bull., 1943,
40, 153-192.

22. CLEVELAND, S., & DYSINGER, D. Mental
deterioration in senile psychosis.
J. abnorm. soc. Psychol., 1944, 39,
368-372.

23. COPPLE, G. E. Senescent decline on the
Wechsler-Bellevue Intelligence Scale.
Unpublished doctoral dissertation,
Univer. Pittsburgh, 1948.

24. DÖRKEN, H. Psychometric differences
between senile dementia and normal
senescent decline. Canad. J. Psychol.,
1954, 8, 187-194.

25. DÖRKEN, H., & GREENBLOOM, GRACE C.
Psychological investigations of senile
dementia. II. The Wechsler-Bellevue
Adult Intelligence Scale. Geriatrics,
1953, 8, 324-333.

26. EYSENCK, H. J., & HALSTEAD, H. The
memory function. I. A factorial study
of fifteen clinical tests. Amer. J.
Psychiat., 1945, 102, 174-180.

27. EYSENCK, M. D. An exploratory study of
mental organization in senility. J.
neurol. neurosurg. Psychiat., 1945, 8,
15-22.

28. EYSENCK, M. D. A study of certain
qualitative aspects of problem solving
behaviour in senile dementia patients.
J. ment. Sci., 1945, 91, 337-345.

29. EYSENCK, M. D. The psychological
aspects of ageing and senility. J. ment.
Sci., 1946, 92, 171-181.

30. FELDMAN, F., & CAMERON, D. E. The
measurement of remembering. Amer.
J. Psychiat., 1944, 100, 788-791.

31. FOX, CHARLOTTE, & BIRREN, J. E.
Intellectual deterioration in the aged:
agreement between the Wechsler-Bellevue
and the Babcock-Levy. J. consult.
Psychol., 1950, 14, 305-310.

32. GARNETT, R. W., & KLINGMAN, W. O.
Cytochrome C: effects of intravenous
administration in presenile, senile and
arteriosclerotic cerebral states. Amer.
J. Psychiat., 1949, 106, 697-702.

33. GOLDSTEIN, K., & SCHEERER, M.
Abstract and concrete behavior: an
experimental study with special tests.
Psychol. Monogr., 1941, 53, 239.

34. GRANICK, S. Studies of psychopathology
in later maturity - a review. J.
Gerontol., 1950, 5, 361-369.

35. GREGORY, I. Nicotinic acid therapy in
psychoses of senility. Amer. J.
Psychiat., 1951, 108, 888-895.

36. GREWEL, F. Testing psychology of
dementias. Folia. psychiat. Neur., 1953,
56, 305-339.

37. HALL, K. R. L. Conceptual impairment
in depressive and organic patients of the
pre-senile age group. J. ment. Sci.,
1952, 98, 256-264.

38. HALSTEAD, H. A psychometric study of
senility. J. ment. Sci., 1943, 89,
363 ff.

39. HALSTEAD, H. Mental tests in senile
dementia. J. ment. Sci., 1944, 90,
720-726.

40. HEBB, D. O. The organization of behavior:
a neuropsychological theory. New York:
John Wiley, 1949.

41. HOPKINS, BARBARA, & POST, F. The
significance of abstract and concrete
behaviour in elderly psychiatric patients
and control subjects. J. ment. Sci.,
1955, 101, 841-850.

42. HOPKINS, BARBARA, & ROTH, M.
Psychological test performance in patients
over sixty. II. Paraphrenics,
arteriosclerotic psychosis and acute
confusion. J. ment. Sci., 1953, 99,
451-463.

43. HULL, C. L. The formation and retention
of associations among the insane.
Amer. J. Psychol., 1917, 28, 419-435.

44. HUNT, J. McV. Psychological
experiments with disordered persons.
Psychol. Bull., 1936, 33, 1-58.

45. INGLIS, J. An experimental study of
learning and “memory function” in
elderly psychiatric patients. J. ment.
Sci., 1957, 103, 796-803.

46. INGLIS, J., SHAPIRO, M. B., & POST, F.
“Memory function” in psychiatric
patients over 60. The role of memory in
tests discriminating between
“functional” and “organic” groups.
J. ment. Sci., 1956, 102, 589-598.

47. JONES, H. E., & KAPLAN, O. J.
Psychological aspects of mental disorders
in later life. In O. J. Kaplan (Ed.),
Mental disorders in later life. (2nd ed.)
Stanford: Stanford Univer. Press, 1956.

ORME, J. E.

48. KRAL, V. A., & DUROST, H. B. A
comparative study of the amnestic
syndrome in various organic conditions.
Amer. J. Psychiat., 1953, 110, 41-47.

49. KRAWIECKI, J. A., COUPER, L., &
WALTON, D. The efficacy of parentro-
vite in the treatment of a group of senile 
psychotics. J. ment. Sct., 1957, 103, 
601-605. 

LassEeN, N. A., Munck, O., & Torrey, 
I. R. Mental function and cerebral 
oxygen colsumption in organic de- 
mentia. Arch. neurol. Psychiat., 1957, 
77, 126-133. 

LILJENCRANTS, J. 
ganic psychoses. 
1922, 32, 143. 

Lovett Doust, J. W., SCHNEIDER, R. A., 
TALLAND, G. A., WatsH, M. A., & 
BarKER, G. B. Studies on the physi- 
ology of awareness. The correlation be- 
tween intelligence and anoxemia in 
senile dementia. J. ner. ment. Dis., 
1953, 117, 383-398. 

McGeocna, J. A., & IrRton, A. L. The 
psychology of human learning. (2nd ed.) 
New York: Longmans Green, 1952. 

Moore, T. V. The correlation between 
memory and perception in the presence 
of diffuse cortical degeneration. Psy- 
chol. Monogr., 1919, 27, 120. 


Memory defects in or- 
Psychol. Monogr., 


Intellectual and Rorschach 
test performance of a group of senile 
dementia patients and of a group of 
elderly depressives. J. ment. Sci., 1955, 
101, 863-870. 

OrME, J. E. Non-verbal and verbal per- 
formance in nermal old age, senile de- 
mentia and elderly depression. J. 
Gerontol., 1957, 12, 408-413. 

Picnot, P. Language disturbances in 
cerebral disease. Arch. neurol. Psy- 
chiat., 1955, 74, 92-95. 

PINKERTON, P., & KeEtty, J. An at- 
tempted correlation between clinical 
and psychometric findings in senile- 


COGNITIVE DEFICIT IN ELDERLY 


60. 
61. 


62. 


64. 


00. 


~ 


~ 
+ 


. SIMMINS, CONSTANCE. 


. SPEARMAN, C. 


. TRUEBLOoD, C. K. 


PSYCHIATRIC PATIENTS 213 
arteriosclerotic dementia. J. ment. Sci., 
1952, 98, 244-255. 

Rapin, A. Psychometric trends in senility 
and psychoses of the senium. J. gen. 
Psychol., 1945, 32, 149-162. 

RAVEN, J. C. The Mill Hill Vocabulary 
Scale. London: Lewis, 1948. 

RAVEN, J. C. Guide to using the Progres- 
sive Matrices. London: Lewis, 1950. 
RoBertson, J. P. S. Age, vocabulary, 
anxiety and brain damage as factors in 
verbal learning. J. consult. Psychol., 

1957, 21, 179-182. 

Rospertson, J. P. S., & WiBBERLEY, H. 
Dementia versus mental defect in mid- 
dle-aged housewives. J. consult. Psy- 
chol., 1952, 16, 313-315. 

Rotu, M., & Hopkins, BARBARA. Psy- 
chological test performance in patients 
over 60. I. Senile psychosis and the af- 
fective disorders of old age. J. ment. 
Sci., 1953, 99, 439-450. 


. Suakow, D., Do-kart, M. B., & GoLp- 


MAN, R. The 
psychoses of the aged. 
tem, 1941, 2, 43-48. 

Suapiro, M. B., Post, F., L6FVING, 
BaRBRO, & INGLIS, J. ‘Memory func- 
tion” in psychiatric patients over sixty, 
some methodological and diagnostic 
implications. J. ment. Sci., 1956, 102, 
233-246. 


memory function in 


Dis. nerv. Sys- 


. SILVERMAN, A. J., Busse, E. W., BARNEs, 


R. H., Frost, L. L., & THALER, 

MarGareEt B. Studies on the processes 

of aging. 4. Physiologic influences on 

psychic functioning in elderly people. 

Geriairics, 1953, 8, 370-376. 

Studies in experi- 
mental psychiatry. IV. Deterioration 

- of “‘g" in psychotic patients. J. ment. 
Sct., 1933, 79, 704-734. 

The abilities of man. 

London: Macmillan, 1932. 


. STEPHENSON, W., PENTOoN, C., & KOREN- 


CHEVSKY, V. Some effects of vitamins 
B and C on senile patients. Brit. med. 
J., 1941, 2, 839-844. 

PHALER, MARGARET. Relationships 
among Wechsler, Weigl, Rorschach, 
EEG findings, and abstract-concrete 
behavior in a group of normal aged sub- 
jects. J. Gerontol., 1956, 11, 404-409. 

The deterioration of 
language in senility. Psychol. Bull., 
1935, 32, 735. (Abstract) 

VerNON, P. E., & McKINLEY, M. Effects 
of vitamin and hormone treatment on 
senile patients. J. meurol. neurosurg. 


Psychiat., 1946, 9, 87-92. 








214 JAMES 


74. WeEcHSLER, D. The measurement of adult 


intelligence. (3rd ed.) Baltimore: Wil- 
liams & Wilkins, 1944. 

. Wecus_LerR, D. A standardized memory 
scale for clinical use. J. Psychol., 1945, 
19, 87-95. 

. WELLS, F. L., & Martin, HELEN A. A. 
A method of memory examination suit- 
able for psychotic cases. Amer. J. 
Psychiat., 1923, 3, 243-257. 

. Wittrams, Moyra. Studies of perception 
in senile dementia: Cue selection as a 
function of intelligence. Brit. J. med. 
Psychol., 1956, 29, 270-279. 

. Wittiams, Moyra. Spatial disorientation 
in senile dementia. J. ment. Sci., 1956, 
102, 291-299. 

. WYLIE, MARGARET. 


An experimental 


INGLIS 


study of recognition and recall in ab- 
normal mental cases. Psychol. Monogr., 
1930, 39, 180. 

Yacorzynskl, G. Y. An evaluation of the 
postulates underlying the Babcock de- 
terioration test. Psychol. Rev., 1941, 48, 
261-267. 

. YaTEs, A. J. The use of vocabulary in 
the measurement of intellectual de- 
terioration—a review. J. ment. Sct., 
1956, 102, 409-440. 

. ZANGWILL, O. L. Clinical tests of memory 
impairment. Proc. Roy. Soc. Med., 
1943, 36, 576-580. 

ZiLBoorG, G., & HENRY, G. W. A history 
of medical psychology. New York: 
Norton, 1941. 


Received December 30, 1957. 





PSYCHOLOGICAL BULLETIN 
Vol. 55, No. 4, 1958


RECENT STUDIES OF EYE MOVEMENTS IN READING¹
MILES A. TINKER 


University of Minnesota 


Although there has been a decrease 
in the number of experiments under-
taken during the last 11 years, the
study of eye movements in reading 
continues to be an important tech- 
nique for investigating the reading 
process. An earlier article by Tinker 
(61) plus his last review (62) to ap- 
pear in this journal covers all the 
material in the field up to 1945. In 
the present review, articles appearing 
during the years from January 1945 
to October 1957 are considered. A 
few reports not available to the re- 
viewer have been included in the 
bibliography for the sake of complete- 
ness (18, 21, 31, 32). Bibliographies, 
critical evaluations, and summaries 
of parts of the field will be found in 
references (2, 5, 7, 11, 17, 22, 27, 30, 
34, 41, 44, 48, 53, 56, 61, 62, 67, 68, 
71). Certain materials on eye move-
ments will not be reviewed or will be
mentioned only briefly since they are
not intimately related to oculomotor 
behavior in ordinary reading. These 
include reports on eye movements in 
viewing pictures or advertisements, 
certain of the studies on visual fixa-
tion, and eye movements as related
to visual attention. 

The studies to be reviewed here 
group themselves as follows: tech- 
niques of measurement, analysis of 
the reading process, training to im- 
prove eye movements, typography 
and eye movements, eye movements 
and fatigue, and summary statement. 

1 The writer is grateful to the University of
Minnesota Graduate School for a research
grant to finance preparation of this material.

TECHNIQUES OF MEASUREMENT

Most of the reports dealing with
techniques of measurement have been
concerned with instrumentation.
Some authors describe modifications 
of earlier apparatus for recording eye 
movements; others set forth new 
principles of measurement. Brandt 
(6) reports the improvements in his 
bidimensional camera. It employs a 
corneal reflection technique with the 
camera lenses so arranged that the 
reflected beams of light come to a 
focus and are recorded as a tiny dot 
on the film. The 35mm film moves 
at a constant rate and stops inter- 
mittently so as to catch each new 
fixation. The camera photographs all 
eye movements in every direction on 
a single film. Analysis of the ocular 
pattern portrayed on the film yields 
the duration, location, and sequence 
of every fixation and every eye move- 
ment. The visual field to be examined 
by a subject can be any size up to a
double-page spread of the Saturday 
Evening Post. This bidimensional 
camera provides an efficient, flexible, 
and reliable instrument for research 
in a variety of eye-movement studies 
including reading. Allen (1) also 
describes a relatively simple corneal 
reflection photographic technique for 
continuously recording vertical and 
horizontal eye movements on one
35mm film. The film runs continu-
ously in the 45° meridian. A mask
with small slits at right angles to the
film is interposed between the camera
lens and film. This produces a record
of horizontal movements from one
eye and vertical movements from the
other eye. The apparatus may be
used for a variety of research projects
besides reading. However, as recog-
nized by the author, the use of one
eye as an indicator of vertical move-
ments and the other eye for hori-
zontal movements is valid only when
fusion, ductions, and phorias are 
normal and the visual field is close to 
a single plane. The Lifwynn eye- 
movement camera is described by 
Syz (57) and by Burrow and Syz (14). 
It is an elaborate and apparently 
expensive apparatus which photo- 
graphs the corneal reflection on 
35mm film. It is possible to expose 
frames from once to 60 or more times 
per second. Duration of fixation and 
sequence of movements are obtained 
from the records. Apparently this 
camera was designed primarily to 
study the influence of various factors 
on visual fixation. It is not readily 
applicable to reading research. 

A number of reports describe elec- 
trical methods of recording eye move- 
ments: Brockhurst and Lion (8), 
Gemelli, Colombi, and Schupfer (23), 
Lundberg (42), Marg (44), Powsner 
and Lion (49), ten Doesschate and 
Lansberg (58). Carmichael and 
Dearborn (17) have described in an 
excellent manner representative 
methods developed to record eye 
movements of human subjects while 
reading. In the chapter on recording 
of eye movements (pp. 146-205) they 
describe in detail and evaluate elec- 
trical techniques. The interested 
reader is referred to this reference. 
An electrical method has distinct 
advantages in some experimental
situations. This is particularly true 
when continuous records must be 
taken for long reading periods. In 
such situations, the less restraint 
placed on the reader the better. In 
contrast to the electrical methods the 
corneal reflection method imposes 
rather severe restraint upon the 
reader by head clamps and the neces- 
sity of elaborate apparatus in close 
contact with the subject. In general, 
recent trends have been to use an 
electrical rather than a corneal reflec-
tion method. Nevertheless, for cer-
tain purposes in studying eye move-
ments a refined photographic tech- 
nique (corneal reflection) is probably 
superior to electrical recording (see 
Tinker [62]). Hartridge and Thomson 
(30) consider that photographing the 
reflection from a scleral mirror is su- 
perior to photographing the corneal 
reflection or the edge of the iris in 
studying eye movements. However,
any method, including this one, that
involves an attachment to the eye- 
ball has distinct disadvantages. 


ANALYSIS OF THE READING PROCESS 


Unit of perception in reading. In
a study by Muñoz, Odoriz, and
Tavazza (46), the eye movements of
children were recorded by means of a
Grass Electro-Encephalograph while
they read Spanish. It was found that
the unit of recognition in reading was
a word or group of words. And when
the reading rate was increased
through specific training, there was a
corresponding increase in the amount
recognized at each fixation of the
eyes. More difficult material was
read with more fixations of the eyes.
It was concluded that the natural
form of reading is not by spelling or
syllabizing but on the basis of whole
groups of words. That is, word forms
or configurations constituted the
units of perception in reading.

As part of a larger study, Gray (28)
compared the basic reading process
of mature readers in 14 different
languages by an analysis of eye-
movement records. The languages
were Arabic, Burmese, Chinese, Eng-
lish, French, Hebrew, Hindi, Japa-
nese, Korean, Navaho, Spanish, Thai,
Urdu, and Yoruba (native Nigerian).
There were two to seven subjects in
each language group. The eye move-
ments were photographed with the
University of Chicago (corneal re-
flection) camera. The content of the
passages read was the same for all
subjects except for the Navaho
Indians. Records were obtained for 
both oral and silent reading. The 
data indicate that the general nature 
of the reading act is essentially the 
same among all mature readers. As 
the mature reader seeks the meaning 
of the passage, he follows along the 
lines with an alternation of short eye 
movements and pauses. At each 
fixation pause he recognizes words as 
wholes, usually two or three at a 
time, and by means of their configu- 
ration and striking characteristics. 
Occasionally regressions occur. Al- 
though there will be some variation 
due to form and structure of a lan-
guage, the teaching of the basic skills
in all of them can be similar to some 
degree. This is an important contri- 
bution to the study of the nature of 
the reading process. The conclusions 
are supported by other studies of 
reading Chinese and Japanese cited 
by Gray (28). A recent study of eye
movements in reading German and
English by Waterman (70) is also in
agreement with Gray. Waterman
found no discernible variation be-
tween the eye-movement reading
patterns of literate native speakers of 
these languages. Also there is no ap- 
parent change in reading habits (eye 
movements) when a native speaker of 
one language learns to read well in 
another language. 

Accuracy in visual fixation and 
movement. Certain studies of visual 
fixation, speed of eye movements, 
and vision during eye movement have 
some bearing on the study of eye
movements in reading. Riggs, Rat- 
liff, Cornsweet, and Cornsweet (51) 
investigated the role of the disappear- 
ance of a steadily fixated visual test 
object when the image is maintained 
in a constant location on the retina 
during normal involuntary move- 
ments of the eye, and when normal 
fixation movements are effectually
doubled. Results indicate that the
normal eye movements during fixa- 
tion serve to overcome loss of vision 
which would result from prolonged, 
uniform stimulation. This would 
probably have a bearing in special 
reading situations which require pro-
longed fixation, such as studying an
intricate formula, especially when the
eyes are fatigued. Lord and Wright 
(40) studied the eye movements oc- 
curring during fixation. They dis- 
covered rapid flicks lasting .02 to .03 
sec. with amplitudes of 2 to 14 
minutes of arc.

Saccadic eye movements, such as 
those employed in reading, appear to 
be remarkably constant in speed. 
Brockhurst and Lion (8) found no 
decrement or acceleration during 90° 
sweeps at rates of 120 to 150 times 
per minute, although near the end of 
the test the moves were less rhythmi-
cal, or the eyes did not sweep the full
distance. Kleitman and Schreider
(35) did find some diurnal variation
for saccadic sweeps in subjects kept
awake for 24 hours. Performance was
poorest in the early morning. Lateral
sweeps templeward were faster than
nasalward. This latter finding had
been reported by Miles in 1924 (see
Tinker [61]). Binocular saccadic eye
movements are ordinarily well co-
ordinated. This has been shown by
Lord (39) and Brandt (6). It is well
known that there is no clear vision
during saccadic eye movements. Bell
and Weir (4) suggest that this inhibi-
tion takes place in the cortex. This
agrees with Holt (1903) but not with
Dodge (1900) (see Tinker, [61]).

Tinker (63) has coordinated the
results of the Minnesota studies on
the time relations for eye-movement
measures. The results follow: (a)
Characteristics of saccadic eye move-
ments are essentially the same
whether they are interfixation move-
ments in reading or other excursions
in the visual field. (b) There are sig-
nificant individual differences in
speed of saccadic eye movements. (c) 
Maximum velocity during eye move- 
ments of large amplitude is greater 
than that for small amplitudes. (d) 
Interfixation eye movements in read- 
ing take 10 to 23 msec. and the return 
sweep to the next line, 40 to 54 msec. 
(e) For most reading situations, eye 
movements take 6 to 8% of reading
time while the rest (92 to 94%) is
devoted to pauses, the periods of 
clear vision. 
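
The share of reading time taken by eye movements can be checked roughly against the movement and pause durations quoted in this review. The following is only a minimal sketch: the function name, the one-line model of reading, the number of fixations per line, and the particular values picked from within the reported ranges are all assumptions made for illustration, not figures taken from the Minnesota studies.

```python
def movement_share(fixations_per_line, pause_ms, move_ms, return_sweep_ms):
    """Fraction of per-line reading time spent in eye movements.

    Assumes one line is read with `fixations_per_line` pauses, one
    interfixation movement between successive pauses, and a single
    return sweep to the next line (a deliberate simplification).
    """
    if fixations_per_line < 1:
        raise ValueError("need at least one fixation per line")
    moving = (fixations_per_line - 1) * move_ms + return_sweep_ms
    pausing = fixations_per_line * pause_ms
    return moving / (moving + pausing)


# Hypothetical line: 6 pauses of 250 msec., 15-msec. interfixation
# movements, and a 45-msec. return sweep (all assumed values).
share = movement_share(6, 250.0, 15.0, 45.0)
print(f"{share:.1%}")  # 7.4%
```

With these assumed values the movement share falls inside the 6 to 8% range reported above; other choices within the quoted ranges move it somewhat above or below.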

Pause duration in reading. Various 
aspects of pause duration patterns in 
reading are considered by Tinker 
(64). The first problem was to deter- 
mine minimum pause duration for 
seeing clearly during binocular vision. 
It was found that it took on the aver- 
age 172 msec. to fixate a dot at the 
end of a saccadic move, and 157
msec. to fixate similarly and identify
a letter. These durations are greater
than necessary where no saccadic
move is involved. A tachistoscopic
exposure of 100 msec. is adequate for 
a well cleared-up perception (see 
Tinker [61]). In reading, pause dura- 
tion is even longer and depends upon 
the content of the material read. 
Thus, for easy prose it is about 220 
msec., for scientific prose about 236 
msec., and for reading objective test 
items about 270 to 324 msec. on the 
average. In general, pause duration 
fluctuates around 250 msec. for 
adults in ordinary reading. Two sets
of factors explain why pause duration
is longer in reading than for cleared-up
vision during fixation as described
above. Oculomotor adjustments of the
eyes constitute one set of factors.
First, we find that the mean reaction 
time of the eyes to eccentric stimula- 
tion is about 173 msec. (64). Note 
that this is about the same as the 
fixation pause on a dot at the end of 
a saccadic move (above). Secondly,
the eyes converge during saccadic in-
terfixation movements and diverge
during the fixation which follows the 
movement. The eye must complete 
this divergence to achieve clear vision 
at the fixation and this takes time. 
Another set of factors involves the 
comprehension process. In addition 
to seeing clearly during a fixation 
pause, the reader must comprehend 
the ideas and relationships involved. 
Actually, therefore, pause duration 
includes perception time plus think- 
ing time. For instance, in reading 
without attention to meaning, pause 
durations are brief and constant 
while for reading algebraic problems 
they are long and variable. 

In reading simple material, pause 
duration is relatively constant but in 
more complex materials it can be 
highly variable. Tinker (64) shows 
that the coefficient of variability 
ranges from 10.4 to 25.8. He also 
shows that reliability of pause dura- 
tion is high, i.e., .82 to .85. 
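
For readers unfamiliar with the statistic, a coefficient of variability expresses the standard deviation as a percentage of the mean. The sketch below assumes that conventional definition and uses the sample standard deviation; the review does not say which form was used, and the pause durations shown are hypothetical values, not data from Tinker (64).

```python
from statistics import mean, stdev


def coefficient_of_variability(durations_ms):
    """Return 100 * SD / mean for a list of pause durations.

    Uses the sample standard deviation (an assumption; the population
    form would give a slightly smaller value).
    """
    return 100.0 * stdev(durations_ms) / mean(durations_ms)


# Hypothetical pause durations (msec.) for one reader.
pauses = [200, 220, 240, 260, 280]
print(round(coefficient_of_variability(pauses), 1))  # 13.2
```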

Pause duration also varies with 
changes in the typographical arrange- 
ment. This will be discussed below. 

In the same report, Tinker showed 
that pause duration alone has poor 
validity as a measure of reading pro- 
ficiency. However, when combined 
with fixation frequency to produce 
perception time, the latter is fairly 
valid as a measure of speed of reading 
proficiency. 
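
The point can be illustrated with a small calculation. The sketch assumes that perception time is simply the total time spent in fixation pauses, i.e., fixation frequency multiplied by mean pause duration; the function name and the two readers are hypothetical.

```python
def perception_time_ms(fixation_count, mean_pause_ms):
    """Total time spent in fixation pauses, in msec. (assumed definition)."""
    return fixation_count * mean_pause_ms


# Two hypothetical readers on the same passage.
reader_a = perception_time_ms(100, 250.0)  # fewer pauses of average length
reader_b = perception_time_ms(140, 240.0)  # shorter pauses, but many more
print(reader_a, reader_b)  # 25000.0 33600.0
```

Reader B has the shorter mean pause yet needs more total pause time, which is why pause duration alone says little about proficiency while the combined measure does.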

Reaction time of the eye. As noted 
above, reaction time of the eye to ec- 
centric stimulation is involved in 
reading movements. Tinker (64) re- 
ports an average reaction time of 173 
msec. As early as 1908 Diefendorf 
and Dodge, cited by Walton (69), 
reported the reaction time of saccadic
movements to a stimulus appearing in
indirect vision to be 195 msec. on the
average with a range of 120 to 235
msec. Westheimer (71) reports reac-
tion times of the eye to eccentric
stimulation by neon bulbs to be 120
to 180 msec. Gerathewohl and Strug- 
hold (24) report a reaction time of 
250 msec. Walton (69) devised a 
clever experiment to duplicate eye 
movements and fixations as utilized 
in a reading task while eliminating 
the factor of comprehension. This 
provided a means of determining the 
reaction time per fixation. Lines of 
five three-letter words, with one-half 
inch between words, were presented 
to 10 subjects before an eye-move-
ment camera (Ophthalm-O-Graph). Records
were obtained as the reader quickly 
fixated each of the successive words. 
Mean reaction time of the eye varied 
from 170 to 309 msec. with a group 
average of 219 msec. From these 
data on reaction time, Walton com- 
puted the maximum reading rates pos- 
sible in terms of limiting physiologi- 
cal and anatomical factors. With a 
reaction time of 200 msec. and 3 or 2 
fixations on a ten-word line, the rate 
would be 853 or 1250 words per 
minute (W.P.M.). And with a reac- 
tion time of 167 msec. and 3 or 2 
fixations per line, the rate would be 
982 or 1451 W.P.M. He concludes 
that for rates over 1451 W.P.M., the 
reader is skimming. These figures 
appear to be overoptimistic since no 
allowance is made for comprehension 
time which must enter into pause 
duration time in ordinary reading. 
No published reports cite average 
reading pauses as low as 200 msec. to 
say nothing of 167 msec. Thus 
Dixon’s Case 13 (19), who had the 
reputation of reading whole lines or 
paragraphs at a glance, only read 
about 500 W.P.M. and used three or 
more fixations per line. The reviewer 
suggests that any rate of over about 
800 W.P.M. can only mean that the 
reader is skimming rather than read- 
ing all the material. Even with easy 
material, 500 W.P.M. is very fast 
reading. Only the exceptional indi-
vidual could achieve around 800
W.P.M. Any claims by advertised 
reading improvement programs that 
one can learn to read at 1500 to 3000 
or more W.P.M. can only be false. 
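
The arithmetic behind such ceilings can be sketched as follows. Walton's own formula is not given in this review, so the model here is an assumption: time per line is taken as the pauses plus one interfixation movement between successive pauses plus one return sweep, with the movement and sweep durations set to the upper ends of the ranges quoted earlier (23 and 54 msec.). The function and parameter names are hypothetical.

```python
def reading_rate_wpm(pause_ms, fixations_per_line, words_per_line=10,
                     move_ms=23.0, return_sweep_ms=54.0):
    """Words per minute under a simple per-line timing model (assumed).

    line time = pauses + interfixation movements + one return sweep
    """
    line_ms = (fixations_per_line * pause_ms
               + (fixations_per_line - 1) * move_ms
               + return_sweep_ms)
    return words_per_line * 60_000.0 / line_ms


# Reaction times and fixation counts taken from the passage above.
for pause in (200, 167):
    for fixations in (3, 2):
        print(pause, fixations, round(reading_rate_wpm(pause, fixations)))
# 200 3 857
# 200 2 1258
# 167 3 998
# 167 2 1460
```

Under these assumptions the results fall within about 2% of the figures attributed to Walton (853, 1250, 982, and 1451 W.P.M.). Substituting a pause of 250 msec., closer to ordinary reading, gives roughly 706 W.P.M. for three fixations per line, which is consistent with the reviewer's suggested ceiling of about 800 W.P.M.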

Rhythm reading. When readers 
tend to employ approximately the 
same eye-movement pattern from 
line to line they are said to employ a 
rhythmical pattern of eye move- 
ments. Dixon (19) found evidence of 
rhythmical reading in only a few of 
the records of his good readers. See 
Tinker (62, p. 98) for an evaluation 
of this concept of rhythm reading. 
According to recent studies the view 
that rhythmical eye-movement pat- 
terns are desirable for effective read- 
ing becomes meaningless. Training 
designed to produce rhythmic pat- 
terns of eye movements in order to 
improve reading, therefore, is based 
upon an invalid assumption. 

Vertical vs. horizontal reading. 
Studies of reading Chinese and Japa- 
nese in vertical and horizontal align- 
ment have been considered in previ- 
ous reviews (61, 62). In Tinker’s 
study (65) of reading English mate- 
rial in vertical and horizontal ar- 
rangements, 10 college subjects were 
given practice during six weeks in 
reading material in a vertical arrange- 
ment (42 readings of 300-word selec- 
tions). Prior to this practice, eye 
movements were photographed while 
reading comparable selections in both 
the horizontal and vertical arrange- 
ments. The same was done after the 
six weeks of practice. At the initial 
testing, the vertical reading was 50% 
slower than the horizontal, but after 
practice it was only 21.8% slower. 
On the initial test, vertical reading 
required fewer fixations, fewer re- 
gressions, and longer pause durations. 
Practice produced marked improve- 
ment (reduction) in fixations and re-
gressions. Apparently long estab-
lished reading habits produced the
superiority of the horizontal reading.
With longer practice, however, it is 
likely that the vertical reading would 
equal or excel the horizontal. It is 
noteworthy that more words were 
read per fixation and fewer regres- 
sions occurred in the vertical arrange- 
ment even before practice. And as 
pointed out by Shen (see Tinker, 61), 
vertical eye movements may be bet- 
ter adapted than horizontal move- 
ments to the short interfixation 
movements used in reading.

Oculomotor efficiency and reading.
Gilbert (25) has attempted to study
the relation of growth in simple ocu-
lomotor control to growth
of eye-movement patterns in reading.
Growth in oculomotor control was 
recorded by photographing eye move- 
ments while reading series of digits 
and growth in eye-movement pat- 
terns while reading simple prose. 
Pupils in Grades I through IX were 
used as subjects. The number in each
grade ranged from 18 to 65 for prose
reading; from 29 to 71 for digit read-
ing. A group of 42 college students
was also used. Records were ob-
tained for reading 100 words and 100
digits. 

It is claimed in the discussion that 
skill in directing the eye movements 
in reading digits (called simple motor 
activity) is substantially related to 
oculomotor performance in reading 
connected (simple) material. Details 
of the findings and conclusions need 
not be reported here because it seems 
to the reviewer that the author has 
employed an invalid criterion of co- 
ordinated motor activity of the eyes. 
Digit reading is classified as simple 
motor activity of the eyes. But read- 
ing digits is reading. Pupils were 
shown how to read the digits prior to 
photographing their eye movements. 
After all, digits are symbols for words. 
Except for space required for print-
ing, 9 is the same as nine. The series
of digits were read, not just fixated. 
The prose samples must have been 
extremely easy reading for all except
the first and perhaps the second 
grade pupils. Eye movements in 
reading such material, therefore, 
should be little influenced by compre- 
hension factors. So there is little sur- 
prise that oculomotor patterns for 
reading the digit series and the easy 
prose turned out to be similar. Un- 
doubtedly they are similar reading 
situations. 

A number of other specific conclu- 
sions are at fault. Following are a 
few: (a) Eye-movement patterns do 
not reflect efficiency of central proc- 
esses of comprehension. There is 
plenty of contrary evidence in the 
literature. If the author had pre- 
sented both easy and very difficult 
materials to his readers he would 
have obtained evidence that difficul- 
ties of comprehension are reflected in 
eye movements. (b) Eye-movement
records do not predict reading test 
performance. They would have if 
taken while reading the test material 
(see Tinker [62, p. 95]). (c) Poor ocu- 
lomotor coordination is a handicap 
in learning to read well. Since no 
satisfactory measure of this coordina- 
tion was used, the conclusion does not 
follow (see Tinker [62, pp. 110-114]).

The author has made a good con-
tribution to the study of the develop-
ment of eye movements with age. This will be
considered below. 

Eye-movement changes with age. In 
the study discussed just above, Gil- 
bert (25) obtained eye-movement 
measures for fixation frequency, re- 
gressions, and pause duration during 
reading for Grades I through IX and 
for college students. The same prose 
selection (easy) was used at all levels 
except college. Proficiency in all 
measures improved throughout the
educational levels although most of
the gains were achieved by the fifth
grade. For good readers the most
rapid growth is during the first four
grades. Poorer readers continue to 
gain throughout most of the grades. 
Oculomotor behavior of college stu- 
dents was only slightly more mature 
than for ninth-grade pupils. Growth 
of eye-movement patterns in reading 
series of digits showed similar pat- 
terns. Much the same growth pat-
terns in eye movements were dis-
covered when the same pupils were
photographed yearly for three suc- 
cessive years in Grades II through IV 
and IV through VI. Thus, records 
from samples at various grade levels 
apparently yield valid measures of 
oculomotor growth. 

Ballantine (3) photographed eye 
movements for readers in Grades II, 
IV, VI, VIII, X, and XII to obtain 
growth curves for the various eye 
movement measures (fixations, re- 
gressions per em, regressions per line). 
All subjects (20 in each grade) read 
an easy selection (second-grade dif- 
ficulty) and a selection appropriate 
to their own grade in difficulty. The 
experiment was carefully controlled 
in all respects. Fixation and regres- 
sion frequency improved rapidly 
from Grades II to IV, slower from IV 
to VIII, and only slightly or not at 
all from VIII to XII. Refixations per 
line decreased steadily to Grade 
VIII. These changes tended to be sig- 
nificant to Grade VIII but not at 
higher levels. There were no im- 
portant differences in the measures 
between the easy selection and the at- 
grade selections. The author's con-
clusion that this indicates that diffi-
culty of material does not affect eye
movements is not valid. The difficult 
selections were at grade for the pupils. 
The results only signify that the eye 
movements are approximately the
same for reading material at grade
and material that is easier than at 
grade. 

Morse (45) investigated the eye 
movements of fifth- and seventh- 
grade pupils when they read material 
at grade (in difficulty), at two grades 
below, and at two grades above their 
school placement. There were 54 
readers in each grade. The reading 
selections were carefully equated to 
the proper level. The eye-movement 
patterns of the seventh-grade pupils 
were more efficient than those of fifth- 
grade on both seventh- and fifth- 
grade material. And seventh-grade 
pupils were more efficient in reading 
fifth-grade material than fifth grad- 
ers reading third-grade material. 
Eye-movement patterns for both 
fifth- and seventh-grade pupils did 
not change in any predictable way 
with increase in difficulty of material 
by two grades. This finding is indeed 
unfortunate. It indicates that the 
school children used as subjects had 
not been successfully taught to vary 
their pace according to the difficulty 
of the material, i.e., they were not 
flexible in adjusting reading pro- 
cedure to difficulty of material. It is 
unlikely that the purpose for the 
reading was so rigidly set that it 
would have prevented flexibility in 
adjusting to difficulty of material. 
When eye movements do not vary
with difficulty of reading matter,
pupils are immature readers.

As part of an extensive investiga-
tion, Dunn (20) compared the eye
movements of retarded boys with
those of normal boys of the same
mental age. There were no signifi-
cant differences between the two
groups in fixation frequency, regres-
sions, and rate of reading. The com-
prehension of retarded boys was
poorer than for the normal boys. The 
former apparently merely looked at 








222 MILES A. 


words without 
content. 

Effect of type of material and set. In 
Dixon’s study (19), a group of pro- 
fessors and a group of graduate stu- 
dents in physics, in history, and in 
education read passages in their own 
and in each of the other two areas. 
The passages were equated for diffi- 
culty by standard formulas. There 
were 16 subjects in each subgroup. 
Eye movements were photographed 
while the passages were read. In con- 
sidering the results, one should keep 
in mind that the subjects were highly 
selected experts. The conclusions 
cannot be applied to the reading of 
school children. Professors, and to a 
lesser degree the graduate students, 
tended to read material in their own 
field with more efficient oculomotor 
patterns, i.e., familiarity of material 
is a factor in determining reading per- 
formance. Different types of ma- 
terial read with the same directions 
(set) do not automatically elicit dif- 
ferent patterns of eye movements 
when the passages are equally difficult. 
These findings are important. The 
above findings do not mean, how- 
ever, that reading for different pur- 
poses or reading materials with wide 
variations in difficulty would not pro- 
duce variation in oculomotor behav- 
ior. Care, therefore, must be exer- 
cised in interpreting the findings of 
this study. Ordinarily the material 
that is read in physics or mathematics 
at any grade level is not of the same 
difficulty as that read in the social 
studies or literature in the same 
grade. Furthermore, the purposes 
for which the reading is done tends to 
differ from one subject matter to an- 
other. 

Ledbetter 


understanding the 


(37) the 


investigated 
variation in eye-movement patterns 
of 60 eleventh-grade pupils while 
reading selections in English, mathe- 


TINKER 


matics, natural science, and social 
science. An attempt was made to 
equate the difficulty of the passages 
read in terms of number of words, 
vocabulary difficulty, sentence length, 
and grammatical structure. No 
standardized formula was used for 
this. Significant differences in ocu- 
lomotor patterns were found for read- 
ing in the various subject matter 
fields. The eye movements in reading 
a poem and mathematics were more 
complex than in the other areas. The 
author admits that the method of 
equating materials for difficulty may 
not have been valid. Other earlier 
studies such as those of Seibert and of 
Stone (see Tinker [62]) found differ- 
ences in eye-movement measures 
while reading different subject mat- 
ter. Examination of all the literature 
suggests that variation in eye move- 
ments occurring with variation in 
subject matter is due largely to dif- 
ferences in difficulty of the material 
or to changes in the purpose for which 
the reading is done rather than to dif- 
ferences in subject matter as such. 
For instance, in Dixon's study cited 
above, the more efficient reading of 
materials that are familiar to the sub- 
ject may only mean that the familiar 
material easier even though 
equated to other materials for diff- 
culty by a formula. 

In an unpublished study by G. R. 
Klare, E. H. Shuford, and W. H. 
Nichols? the eye movements of 30 
college students were photographed 
while reading technical material dif- 
fering in style difficulty first with a 
weaker set to learn and also with a 
stronger set to learn. The easier style 
was read with significantly fewer fixa- 
tions and regressions. The stronger 
set to learn resulted in somewhat 


Was 


2 Communicated to the writer in mimeo- 


graph form by G. R. Klare. 








more fixations and regressions. Thus 
style difficulty is inversely related to 
reading efficiency as measured by eye 
movements. 

Flexibility in adjusting eye move- 
ments. It is generally recognized that 
the mature reader is the versatile 
reader. He will change his pace (re- 
flected in eye movements) to fit the 
purpose of reading and the nature 
and the difficulty of the material. He 
will read rapidly when that is appro- 
priate. In certain other situations 
he will employ slow, analytical read- 
ing. Many readers have not achieved 
this versatility (see the Morse study 
reviewed above). Laycock (36) in- 
vestigated what happens to oculo- 
motor behavior of good readers when 
they try to read simple material at a 
very rapid rate. A sample of 72 was 
drawn from 391 college readers: 37 
flexible and 35 inflexible. The flexible 
readers were able to read quickly 
when directed to and maintain ade- 
quate comprehension; the inflexible 
ones could not change their pace even 
when directed to do so. Eye move- 
ments of these two groups were pho- 
tographed when reading at normal 
rate and when directed to read 
quickly. Analysis of the eye-movement measures showed that the
flexible in comparison to the inflexible group had significantly fewer
fixations and shorter pause duration when directed to read quickly.
There was no significant change in regressions. The inflexible group
did make some improvement but far less than that made by the flexible
readers. So readers tend to be more or less flexible rather than
strictly flexible or inflexible. It is suggested that emphasis be
placed upon helping the less flexible readers in reader improvement
programs since they tend to do all their reading at about the same
rate. Lack of flexibility in reading at the college level constitutes
a serious handicap.

Individual differences. Wide in- 
dividual differences are always found 
in reading. These are readily de- 
tected in eye-movement records. 
Special emphasis is placed upon these 
variations by such authors as Morse 
(45), Dixon (19), Ballantine (3), and 
Gilbert (25). 

Eye movements in special reading situations. Brandt (6) cites data on
eye movements while reading in several special reading situations:
(a) Geometry. In comparing good and poor geometry students at the end
of two semesters’ study, the poor students made about three times as
many fixations as the good students per correct answer to problems.
The many random eye movements of the inferior students indicated lack
of systematic attack on problems. (b) Algebra. Here also the inferior
students made more fixations and exhibited random moves for
re-examination of the problems. (c) Arithmetic. In attempting to find
an error in a long division problem, superior students use more
systematic patterns of eye movements. (d) Spelling. In attempting to
identify correctly spelled words in a multiple-choice test, an
inefficient (and low ability) pupil employed inadequate sequences of
moves. Even the good student used many random and haphazard eye
movements. (e) Geography. In reading a map, the efficient pupil
employs relatively few systematic fixations and eye movements.
(f) Intelligence tests. Low ability pupils make more fixations and
more ineffective moves than high ability pupils. In general, in these
specific learning situations, the efficient pupil tends to employ
effective oculomotor patterns while the inefficient pupil uses an
excessive number of fixations and a haphazard pattern of movements.
These conclusions must be tentative since relatively few subjects were
measured by Brandt.

Lofquist (38) studied the eye-movement patterns in reading clerical
test items which consist of pairs of names and pairs of numbers to be
compared. The eye movements of 40 university students were
photographed while they read 16 lines of prose, 20 name and 20 number
items of the Minnesota Vocational Test for Clerical Workers. Response
to the test required detection of likeness or difference of the two
parts in an item. Reading the test items required a more analytical
procedure than prose, i.e., more fixations and more regressions. The
number items were read with more fixations and regressions and longer
perception time than the name items. A similar trend was found for
long (more complex and difficult) vs. short items. The results of this
study furnish confirmation of earlier investigations (61, 62) which
have demonstrated the adaptability of eye movements to central
perceptual processes. It is now well established that oculomotor
reactions are exceedingly flexible and quickly reflect any variation
in the central processes of perception, judgment, comprehension, etc.
In other words, it appears that eye-movement patterns merely reflect
ease or difficulty of reading, efficient or poor reading performance,
and degree of comprehension, rather than cause good or poor reading.
Certain writers, on the basis of incomplete evidence, wrongly infer
that eye-movement patterns are stable and unmodifiable (45) or are
limited and fixed by some underlying motor ability (25). Versatility
in adjusting reading habits (including eye movements) to variation in
purposes and materials is one “hallmark” of maturity in reading.




TRAINING TO IMPROVE EYE 
MOVEMENTS 

Various kinds of training have 
been employed in attempts to im- 
prove eye movements in reading. It 
is assumed by some authors that if a 
person’s oculomotor patterns are de- 
veloped to be similar to those which 
characterize efficient reading, his 
reading proficiency would improve. 
The earlier studies in this area have 
been evaluated by Tinker (62). The 
more recent investigations will be 
briefly noted. Glock (26) studied the 
effect upon eye movements of three 
methods of training: (a) using the 
Harvard films which expose phrases 
in succession and thus train eye 
movements; (b) using a new film
which exposed two successive lines 
simultaneously; and (c) reading 
printed material while motivated to 
read fast and comprehend. Four 
weeks training was given to six sec- 
tions of college students. Students 
made significant improvement in eye 
movements (fixations, regressions, 
pause duration) under all three meth- 
ods of training. But there were no 
significant differences between the re- 
sults of the three methods, i.e., the 
technique that paced the eyes was no 
more effective than the others in 
modifying eye movements. 

In a carefully controlled experi- 
ment, Manolakes (43) checked the in- 
fluence of omitting tachistoscopic 
training on changes in eye move- 
ments. Experimental and control 
groups received training on a reading 
rate controller plus vocabulary and 
comprehension training. The control 
group also had tachistoscopic train- 
ing; the experimental group did not 
but received somewhat broader train- 
ing in vocabulary and comprehen- 
sion. End tests revealed no signifi- 
cant differences between groups in re- 
duction of fixations, regressions, and 
pause duration. Therefore, the readers were not penalized by omission
of tachistoscopic training. This substantiates previous findings that
tachistoscopic training tends to be of doubtful value in the reading
improvement program.

In Tillson’s experiment (59), col- 
lege students in a semester reading 
course were trained with a reading 
accelerator, the Harvard films, and 
timed reading. Eye-movement rec- 
ords taken near the beginning and 
near the end of the course revealed a 
significant reduction in fixation fre- 
quency and regressions. There were 
also significant gains in speed and 
comprehension. It was concluded that reading proficiency was improved
by changing reading (oculomotor) patterns. The conclusion should be
that the improved reading was reflected in a changed oculomotor
pattern. Like so many authors, Tillson has assumed wrongly that
proficient eye movements produce efficient reading rather than vice
versa.

Westover (72) investigated the comparative effectiveness of three
methods of improving the reading performance of college freshmen:
(a) Group I had two fifty-minute practice periods per week for five
weeks in reading and taking tests on study type of reading exercises;
(b) Group II had the same practice on the same material by means of a
device for controlling eye movements; (c) Group III attended college
but received no special instruction in reading. All groups made
significant gains in speed and comprehension but Groups I and II made
greater gains. There was, however, no difference in the gains for
Groups I and II. Therefore, the mechanical control of eye movements
did not show significantly better results than the same reading
exercises used alone.

The present evaluation of eye-movement training or training to provide
improved eye movements in reading would be the same as that given by
Tinker (62, p. 112) in 1946. The improvement obtained by such
training, with or without elaborate apparatus, is no greater than that
resulting from well-motivated reading practice alone. See Tinker (62)
for details.

TYPOGRAPHY AND EYE MOVEMENTS

Although Burt, Cooper, and Mar- 
tin (15) considered eye movements in 
studying the effect of typographical 
variations on reading, no important 
results are cited. Hackman and 
Tinker (29) made an analysis of ocu- 
lomotor patterns employed during 
the reading of material printed in a 
variety of combinations of colored ink 
and colored paper. A Latin square
experimental design and 49 readers 
were used. Each of the eye-move- 
ment measures (perception time, fixa- 
tion frequency, pause duration, and 
regression frequency) showed varia- 
tions from one color combination to 
another in such a way as to indicate 
print of good and of poor legibility. 
Black (ink) on yellow (paper), red 
on white, green on red, and black on 
white provided best legibility. Black 
on purple, pale orange on white, and 
red on green provided worst legi- 
bility. The results indicate that 
brightness contrast between ink and 
paper rather than color variation as 
such determines legibility of the 
print. In using combinations of col- 
ored ink and paper, therefore, maxi- 
mum legibility is achieved by using a 
printing arrangement with a maxi- 
mum brightness contrast between 
print and background. It might be 
mentioned that these findings are in 
harmony with a large body of data 
derived from experiments on bright- 
ness contrast in relation to visibility 
of print and speed of reading. 
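
As a rough illustration of what a Latin square arrangement of reading conditions involves, the sketch below builds a cyclic square in which every color combination appears once in each serial position. The seven combinations are the ones named in the paragraph above. That the square was 7 x 7, that it was cyclic, and that the 49 readers formed seven groups of seven are assumptions made for the example, not details reported in this review.

```python
from typing import List

# The seven ink-on-paper combinations named in the text. Treating them as
# the complete set of conditions is an assumption made for illustration.
COMBINATIONS = [
    "black on yellow",
    "red on white",
    "green on red",
    "black on white",
    "black on purple",
    "pale orange on white",
    "red on green",
]


def cyclic_latin_square(conditions: List[str]) -> List[List[str]]:
    """Return an n x n cyclic Latin square of presentation orders.

    Row r is the order given to (hypothetical) reader group r. Each
    condition occurs exactly once in every row and in every column.
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


square = cyclic_latin_square(COMBINATIONS)
for group_number, order in enumerate(square, start=1):
    print(f"group {group_number}: " + ", ".join(order))
```

With such an arrangement, differences between color combinations are not confounded with the serial position at which a combination is read. A cyclic square does not balance which condition immediately precedes which, so it is only one of several possible layouts.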










In the reading situation the 
amount of material perceived at each 
fixational pause is known as the per- 
ceptual or fixation span. Paterson 
and Tinker (47) studied the effect of 
typographical variations upon the 
perceptual span in reading. The fol- 
lowing nonoptimal typography re- 
duced significantly the perceptual 
span: Old English type face, 6 point 
and 14 point type, 9 and 43 pica line 
widths, a low brightness contrast be- 
tween print and paper, and combina- 
tions of the above. In general, opti- 
mal typography favors a large per- 
ceptual span and nonoptimal typog- 
raphy reduces significantly the span. 
It should be noted that other factors 
such as comprehension requirements 
may affect the perceptual span more
than typographical changes. 

An extensive study of the effect of typographical variations upon eye
movements in reading was completed by Tinker and Paterson (66). Eye
movements were photographed while reading optimal and nonoptimal
printing arrangements. The typographical variations studied included
line width, size of type, type faces, type form, white vs. black
print, brightness contrast between print and paper, and combinations
of the above. The oculomotor measures were fixation frequency, words
per fixation, regression frequency, pause duration, and perception
time. The oculomotor patterns varied from one typographical
arrangement to another. Examination of these patterns revealed the
nature of the perceptual difficulties involved in each nonoptimal
printing arrangement. For example, in contrast with lower case print,
all capital printing was read with more fixations, a decrease in pause
duration, an increase in perception time, and no change in regression
frequency. The characteristic word forms which facilitate rapid
reading in lower case printing are absent in all capitals. For details
of the other comparisons, see the original report (66).

EYE MOVEMENTS AND FATIGUE 

The results in various early experiments (see Tinker [62]) suggest
that efficiency of oculomotor behavior may be affected by conditions
which are intended to produce fatigue. Hoffman (33) had 30 college
subjects read continuously for four hours while eye movements were
recorded as electro-oculograms. Nine five-minute samples of the
reading were recorded: at the beginning and during the last five
minutes of each 30 minutes of reading during the four hours. The
number of fixations (for five minutes) and the number of lines read
decreased significantly after the first half hour of reading. However,
length of reading period had relatively little influence on number of
fixations per line, i.e., no significant increase until the very end
of the four hours where there was an increase of 0.430 of a fixation
per line on the average. Apparently the author has misinterpreted his
data. A decrease in fixations ordinarily means more efficient reading
and an increase means less efficient reading. Since there was a
decrease in lines read at successive half hours and since there were
just about the same number of fixations per line at all times, there
would necessarily be fewer fixations in five minutes of reading. So
the decrement in total fixations per five minutes is an artifact.
There were no significant changes in eye fixations per words read or
per line except at the very last period measured in comparison to the
first five minutes of reading. Furthermore, the decrement in number of
lines read on successive measurements may only mean a lessening of
motivation, as the author recognizes. Similarly, regressions per line
show no significant change until the end of four hours of reading,
when there was an increase of 0.175 per line on the average.
Considering the influence of long periods of reading on eye movements,
the only significant finding in this study is that fixations per line
and regressions per line increased significantly after four hours of
reading. This suggests that fatigue was beginning to operate by the
end of four hours of reading, but experimental confirmation is needed.
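
The artifact argument made above about Hoffman's total fixation counts rests on simple arithmetic: total fixations in a five-minute sample equal lines read multiplied by fixations per line. The sketch below works this through with invented numbers. The figures of 100 and 80 lines and 8 fixations per line are hypothetical and are not taken from Hoffman's report.

```python
# Hypothetical figures chosen only to illustrate the arithmetic; they are
# NOT Hoffman's data.


def total_fixations(lines_read: int, fixations_per_line: float) -> float:
    """Total fixations in a sample = lines read x fixations per line."""
    return lines_read * fixations_per_line


# Assumed to stay constant over the session, as the review describes.
FIXATIONS_PER_LINE = 8.0

first_sample = total_fixations(lines_read=100, fixations_per_line=FIXATIONS_PER_LINE)
later_sample = total_fixations(lines_read=80, fixations_per_line=FIXATIONS_PER_LINE)

# The total drops only because fewer lines were read; the per-line measure,
# which is the one that reflects reading efficiency, is unchanged.
print("total fixations, first sample:", first_sample)
print("total fixations, later sample:", later_sample)
print("fixations per line, first:", first_sample / 100)
print("fixations per line, later:", later_sample / 80)
```

On these assumed figures, a drop in total fixations accompanies a drop in lines read while fixations per line do not change, which is why the per-line measures are the ones to interpret.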

Carmichael and Dearborn (17) set themselves the task of determining
how long a normal human subject can continue to read before there are
significant changes in his reading behavior. Twenty subjects read an
interesting historical novel for six hours continuously and for
another six hours an economic treatise. Another 20 subjects read the
same materials reproduced on microfilm (and projected) for similar
periods. Short comprehension tests were interspersed during the
reading to maintain motivation and a long test was given at the end of
the six hours. Eye movements were recorded electrically in the form of
an electro-oculogram. Five-minute records were obtained at the
beginning and during the last five minutes of each half hour.
Oculomotor measures considered were fixations per five-minute period,
fixations per line, fixation sigma score, regressions per five-minute
period, and regressions per line. The results, considering all
oculomotor measures, indicate that subjects read as well at the end of
six hours as at the beginning, i.e., the oculomotor patterns did not
change significantly. This was true for reading both the books in
regular print and in microfilm reproduction. It is suggested that the
task of reading for six hours is such that for normal subjects a new
“steady rate” of some sort may be established during continuous work
by the visual receptor-neuromuscular mechanism. Except for a few
reports of mild discomfort, it was not found that reading for the six
hours was done at any “cost” to the organism. The practical
educational inference that seems justified is that there is no basis
for the belief that requiring long periods of reading by high school
and college students may be injurious to the visual mechanisms of such
students if their eyes are in fair condition to start with. One should
keep in mind that the reading in this experiment was done under
optimal or near optimal conditions of print and lighting. Furthermore,
the authors do not deny that significant deterioration might occur
under more stressful conditions. Brozek (9), in commenting on the
findings, questions the authors’ interpretations and at the same time
states that in a similar study Hoffman (33) obtained definite symptoms
of deterioration within four hours. Actually, as stated above, Hoffman
only found a suggestion of deterioration at the very end of the four
hours. Also the reader should remember that Carmichael and Dearborn
limit fatigue as they use it to a descriptive term. Actually, in terms
of this definition, therefore, and in view of the purpose of the
experiment (see above), as well as in terms of practical implications
(see above), Carmichael and Dearborn have made a very important
contribution.

In considering quantitative criteria of oculomotor performance in
relation to fatigue, Brozek (10) notes that in the course of severe
stress of making saccadic eye moves as fast as possible for four
minutes through an arc of about 14° there was significant
deterioration in fixation time, velocity of movements, and extent of
corrective movements. After two hours of severe visual work, there was
similar deterioration. Brozek, Simonson, and Keys (13) cite similar
trends. Simonson and Brozek (54) photographed saccadic eye movements
before and after two hours of severe visual discrimination under 5,
100, and 300 foot-candles of illumination. There was deterioration in
all oculomotor measures but only significantly so for part of the
measures. In another study Simonson and Brozek (55) were concerned
with the effect of spectral quality of light on visual performance and
fatigue. One aspect dealt with oculomotor behavior before and at the
end of two hours of severe visual work under five foot-candles of
light. Although there was deterioration for most oculomotor measures,
only two were significantly different. Velocity and average deviation
of eye movements from the target were significantly greater for work
under illumination from natural white lamps than for work under
illumination from ordinary inside frosted lamps. The third type of
lamp was the Verd-A-Ray. Later Brozek and Simonson (12) present a more
detailed analysis of data comparing these three sources of
illumination. No significant differences in oculomotor behavior were
found after the two hours of severe visual work.

Eye movements of normal and ex- 
ophoric (eyes tend to turn out from 
desired position) readers were photo- 
graphed by Salatini (52). At the be- 
ginning of lines of print, i.e., at the 
end of the return sweep, the exo- 
phoric group made somewhat greater 
divergent movements. It is suggested 
that this might well account for the 
greater fatigue from reading of this
group.


Evaluation of data on oculomotor 
behavior as a measure of fatigue (ir- 
respective of how fatigue is defined) 
would seem to indicate the following: 

1. In the ordinary reading situa- 
tion with motivated readers at high 
school and college level, working 
under optimal conditions of print and 
light, oculomotor behavior does not 
deteriorate while reading continu- 
ously as long as six hours. There may 
be slight signs of fatigue after four 
hours of reading where motivation is 
not controlled. 

2. Various aspects of saccadic eye 
movements (rate, accuracy of fixa- 
tion at end of excursion, etc.) tend to 
show deterioration when under stress 
such as after two hours of strenuous 
visual inspection. 


SUMMARY STATEMENT 


Studies of eye movements in read- 
ing are diminishing in number during 
recent years. There are approxi- 
mately 40% fewer studies during the 
past 10 years than in the previous 
decade. Recent studies reveal a 
lively interest in devising new or 
modified techniques for recording 
eye movements. There is a general 
tendency to move in the direction of 
some type of electrical recording to 
obtain electro-oculograms or some- 
thing comparable. Photographing 
the corneal reflection is employed 
much less than formerly because it is 
less flexible than many of the newer 
techniques. A large amount of recent 
work has been concerned with the 
study of visual fixation, speed of eye 
movements, reaction time of the eye, 
oculomotor efficiency, and vision dur- 
ing eye movements. 

Several experiments deal with the nature of the reading process,
maturation of oculomotor patterns, and the influence of subject matter
on eye movements. An appreciable number of investigations deal with
the effects of visual fatigue on eye movements and with the influence
of typographical variations on oculomotor patterns. There are a few
studies of eye movements in special reading situations. It appears
that some writers still adhere to the mistaken notion that training
eye movements as such is an effective way to improve reading. Numerous
nonexperimental articles are concerned with general discussions and
evaluations of eye movements in reading. It would appear from a survey
of the literature that the study of eye movements in reading is to
some degree reaching the stage of diminishing returns. Relatively few
of the recent experiments deal with fundamental problems. Too many
investigators appear to be unfamiliar with the literature in the
field. In several instances findings are reported as new when in fact
they are the same as discovered and reported earlier. The future study
of eye movements in reading does not appear to be too promising. The
eye-movement approach to the study of the reading process has probably
made its major contributions. What we now need is less activity by
dilettantes who are inadequately prepared to see the fundamental
problems and unable to design suitable experiments in the field.
Undoubtedly there will be a few researchers thoroughly grounded in the
field who will sense basic problems and organize sound experimental
procedures of investigation. Such individuals will be the ones to
contribute worthwhile new information concerning eye movements in
reading.


REFERENCES 


1. Allen, M. J. A simple photographic method for continuously
recording vertical and horizontal eye movements. Amer. J. Optom.,
1955, 32, 88-93.

2. Anderson, I. H., & Morse, W. C. The place of instrumentation in the
reading program I. Evaluation of the Ophthalm-O-Graph. J. exper.
Educ., 1946, 14, 256-262.

3. Ballantine, F. A. Age changes in measures of eye-movements in
silent reading. In Studies in the psychology of reading. University of
Michigan Monographs in Education, No. 4. Ann Arbor: Univer. of
Michigan Press, 1951, 65-111.

4. Bell, G. H., & Weir, J. B. de V. Vision during glancing movements
of the eyes. Trans. Ophthal. Soc. U.K., 1947, 67, 221-228.

5. Brandt, H. F. Ocular photography as a scientific approach to the
study of the psychological aspects of seeing. Illum. Engng, N. Y.,
1944, 39, 279-289.

6. Brandt, H. F. The psychology of seeing. New York: Philosophical
Library, 1945.

7. Bratbak, J. Oyebevegelesene ved lesning. [Movements of the eyes in
reading.] Norsk ped. Tidskr., 1943, 27, 59-163.

8. Brockhurst, R. J., & Lion, K. S. Analysis of ocular movements by
means of an electrical method. A.M.A. Arch. Ophthal., 1951, 46,
311-314.

9. Brozek, J. Visual fatigue: a critical comment. Amer. J. Psychol.,
1948, 61, 420-424.

10. Brozek, J. Quantitative criteria of oculomotor performance and
fatigue. J. appl. Physiol., 1949, 2, 247-260.

11. Brozek, J. Quantitative analysis of voluntary eye movements. In
Gerard, R. W., Methods in Medical Research, Vol. 3. Chicago Yearbook
Publisher, 1950, 199-207.

12. Brozek, J., & Simonson, E. Visual performance and fatigue under
conditions of varied illumination. Amer. J. Ophthal., 1952, 35, 33-46.

13. Brozek, J., Simonson, E., & Keys, A. Changes in performance and in
ocular functions resulting from strenuous visual inspection. Amer. J.
Psychol., 1950, 63, 51-66.

14. Burrow, T., & Syz, H. Studies with the Lifwynn eye-movement
camera. J. Biol. Photogr. Ass., 1949, 17, 155-170.

15. Burt, C., Cooper, W. F., & Martin, J. L. A psychological study of
typography. Brit. J. Stat. Psychol., 1955, 8(1), 29-57.


16. CARMICHAEL, L. 


Reading and visual 








MILES A. 


work: a contribution to the technique 

of experimentation on human fatigue. 

Trans. N. Y. Acad. Sci., 1951, 14, 94— 

96. 

. CARMICHAEL, L., & DEARBORN, W. F. 

Reading and visual fatigue. Boston: 

Houghton Mifflin, 1947. 

. Cornsweet, T. N. A determination of 
the stimuli for involuntary drifts and 
saccadic eye movements. Dissertation 
Abstr., 1955, 15, 1446. 

Dixon, W. R. Studies of the eye-move- 
ments in reading of university profes- 
sors and graduate students. In Studtes 
in the psychology of reading. University 
of Michigan Monographs in Education, 
No. 4. Ann Arbor: Univer. of Michigan 
Press, 1951, 113-178. 

Dunn, L. M. Acomparison of the reading 
processes of mentally retarded and nor- 
mal boys of the same mental age. In 
Studies of reading and arithmetic in 
mentally retarded boys. Child Develop- 
ment Publications of the Society for 
Research in Child Development, 
Monog. Serial No. 58, No. 1. Lafayette 
Ind.: Child Development Publications, 
1954, 19, 7-99. 

EpGincton, E. S$. Interocular transfer 
with control for conjugate eye-move- 
ment. Dissertation Abstr., 1955, 15, 
1656-1657. 

. Fits, P. M. Use of the eyes in reading in- 

struments. O-Eye-O, 1951, 17, 11-15. 

. GEMELLI, A., Cocompt, C., & SCHUPFER, 

R. E. L’enregistrement electrique des 

mouvements oculogyres et ses applica- 

tions. Cont. Lab. Psicol. Milano: 

Univer. Sacro Cuore, 1952, Ser. 15, 26- 

36. 

. GERATHEWORL, S. J., & StTRUGHOLD, H. 

Motoric responses of the eyes when ex- 

posed to light flashes of high intensities 

and short duration. J. Aviat. Med., 

1953, 24, 200-207. 

. Gitpert, L. C. Functional motor effi- 

ciency of the eyes and its relation to 

reading. Univ. Calif. Publ. Educ., 1953, 

11 (3), 159-232. 

. Gtockx, M. D. Effect upon eye move- 
ments and reading rate at the college 
level of three methods of training. 
J. educ. Psychol., 1949, 40, 93-106. 

Gray, W. S. Summary of reading in-
vestigations. J. educ. Res., 1946, 39,
401-433; 1947, 40, 401-435; 1948, 41,
401-435; 1949, 42, 401-437; 1950, 43,
401-437; 1951, 44, 401-441; 1952, 45,
401-435; 1953, 46, 401-437; 1954, 47,
401-439; 1955, 48, 401-442; 1956, 49,
401-436; 1957, 50, 401-441.

Gray, W. S. A study of reading in four-
teen languages. In The teaching of read-
ing and writing: An international study.
Chicago: Scott Foresman, 1956.

Hackman, R. B., & Tinker, M. A. Ef-
fect of variations in color of print and
background upon eye movements.
Amer. J. Optom., 1947, 34, 354-359.

Hartridge, H., & Thomson, L. C. Meth-
ods of investigating eye movements.
Brit. J. Ophthal., 1948, 32, 581-591.

Henry, L. K., & Lauer, A. R. A com-
parison of four methods of increasing
the reading speed of college students.
Proc. Iowa Acad. Sci., 1939, 46, 273-
276.

Hillman, H. H. The photographic study
of children’s eye-movements during
reading. Res. Rev., 1955, No. 6, 27-39.

Hoffman, A. C. Eye-movements during
prolonged reading. J. exp. Psychol.,
1946, 36, 95-118.

Jiménez-Hernandez, A. Psicologia de la
lectura: investigación del proceso oculo-
motor. [The psychology of reading: in-
vestigation of the oculomotor process.]
Pedagogia, 1954, 2 (1), 29-45.

Kleitman, N., & Schreider, J. E.
Diurnal variation in oculomotor per-
formance. Année psychol., 1951, 50,
202-215.

Laycock, F. Significant characteristics of
college students with varying flexibility
in reading rate. I. Eye-movements in
reading prose; II. Motor and percep-
tual skill in “reading” material whose
meaning is unimportant. J. exp. Educ.,
1955, 23, 311-330.

Ledbetter, F. G. Reading reactions for
varied types of subject matter: an
analytical study of the eye movements
of 11th grade pupils. J. educ. Res.,
1947, 41, 102-115.

Lofquist, L. Eye movements in a special
reading situation. M.A. thesis, Univer.
of Minnesota, 1941.

Lord, M. P. Measurement of binocular
eye movements of subjects in the sitting
position. Brit. J. Ophthal., 1951, 35,
21-30.

Lord, M. P., & Wright, W. D. Eye
movements during monocular fixation.
Nature, 1948, 162, 25-26.

Lord, M. P., & Wright, W. D. Eye
movement research. Light and Lighting,
1949, 42, 309-312.


Lundberg, N. Electric registration of
eye movements. Acta Otolaryng., 1941,
29, 451-455.

Manolakes, G. The effects of tachisto-
scopic training in an adult reading pro-
gram. J. appl. Psychol., 1952, 36, 410-
412.

Marg, E. Development of electro-ocu-
lography. A.M.A. Arch. Ophthal., 1951,
45, 161-185.

Morse, W. S. A comparison of the eye-
movements of average fifth and seventh
grade pupils’ reading materials of corre-
sponding difficulty. In Studies in the
psychology of reading. University of
Michigan Monographs in Education,
No. 4. Ann Arbor: Univer. of Michigan
Press, 1951, 1-64.

Munoz, J. M., Oporiz, J. B., & Tavazza,
J. Registro de los movimientos occu-
lares durante la lectura. Rev. sociedad
Argentina Biologia, 1944, 20, 280-2860.

Paterson, D. G., & Tinker, M. A. The
effect of typography upon the per-
ceptual span in reading. Amer. J.
Psychol., 1947, 60, 388-396.

Perry, W. G., & Whitlock, C. P. A
clinical rationale for a reading film.
Harv. educ. Rev., 1954, 24, 6-27.

Powsner, E. R., & Lion, K. S. Testing
eye muscles. Electronics, 1950, 23, 96-
99.

Ratliff, F., & Riggs, L. A. Involuntary
motions of the eye during monocular
fixation. J. exp. Psychol., 1950, 40,
687-701.

Riggs, L. A., Ratliff, F., Cornsweet,
J. C., & Cornsweet, T. N. The dis-
appearance of steadily fixated visual
test objects. J. opt. Soc. Amer., 1953,
43, 495-501.

Salatini, R. W. Behavior of vision.
Optom. Wkly., 1953, 44, 1725-1729.

Scipione, A. M. Eye movements as re-
lated to reading. Columbia Optom.,
1953, 27, 3+.

Simonson, E., & Brozek, J. Effects of
illumination level on visual performance
and fatigue. J. opt. Soc. Amer., 1948,
38, 384-397.

Simonson, E., & Brozek, J. The effect of
spectral quality of light on visual per-
formance and fatigue. J. opt. Soc.
Amer., 1948, 38, 830-840.

Sommerfeld, R. E. An evaluation of the
tachistoscope in reading improvement
programs. Third Year-book of the
Southwest Reading Conference for Col-
leges and Universities. Fort Worth:
Texas Christian Univer. Press, 1954,
7-25.

Syz, H. The Lifwynn eye-movement
camera. Science, 1946, 103, 628-629.

Ten Doesschate, G., & Lansberg,
M. P. Time consumption in eye move-
ments. Ophthalmologica, 1954, 128,
298-300.

Tillson, M. W. Changes in eye-move-
ment pattern. J. higher Educ., 1955, 26,
442-445; 458.

Tinker, M. A. Apparatus for recording
eye movements. Amer. J. Psychol.,
1931, 43, 115-118.

Tinker, M. A. Eye movements in read-
ing. J. educ. Res., 1936, 30, 241-277.

Tinker, M. A. The study of eye move-
ments in reading. Psychol. Bull., 1946,
43, 93-120.

Tinker, M. A. Time relations for eye-
movement measures in reading. J.
educ. Psychol., 1947, 38, 1-10.

Tinker, M. A. Fixation pause duration
in reading. J. educ. Res., 1951, 44, 471-
479.

Tinker, M. A. Perceptual and oculo-
motor efficiency in reading materials in
vertical and horizontal arrangement.
Amer. J. Psychol., 1955, 68, 444-449.

Tinker, M. A., & Paterson, D. G. The
effect of typographical variations upon
eye movements in reading. J. educ.
Res., 1955, 49, 171-184.

Traxler, A. E., & Townsend, A. An-
other five years of research in reading.
New York: Educational Records Bu-
reau, 1946.

Traxler, A. E., & Townsend, A. Eight
more years of research in reading. New
York: Educational Records Bureau,
1955.

Walton, H. N. Vision and rapid reading.
Amer. J. Optom., 1957, 34, 73-82.

Waterman, J. T. Reading patterns in
German and English. The German
Quart., 1954, 26, 225-227.

Westheimer, G. Mechanisms of saccadic
eye movements. A.M.A. Arch. Ophthal.,
1954, 52, 710-723.

Westover, F. L. Controlled eye move-
ments versus practice exercises in read-
ing: A comparison of methods of im-
proving reading speed and comprehen-
sion. Teach. Coll. Contr. Educ., 1946,
No. 917.


Received December 20, 1957.








PSYCHOLOGICAL BULLETIN 
Vol. 55, No. 4, 1958


SEX-ROLE DEVELOPMENT IN A CHANGING CULTURE¹


DANIEL G. BROWN 


United States Air Force Academy 


One of the more significant psycho- 
social developments of contemporary 
American society would appear to be 
the relatively fluid state of the sex 
roles of individuals. Within a single 
generation, significant changes have 
taken place in the traditional concep- 
tions of what is masculine and what is 
feminine. Whether such changes 
have been abrupt enough to be con- 
sidered a cultural revolution or suf- 
ficiently gradual to be simply degrees 
of cultural variation is difficult to 
judge. In either case, however, this 
changed and changing cultural pat- 
tern has a number of implications and 
possible effects that bear directly on 
individual, group, and institutional 
behavior. In this connection such
questions as the following might be
asked: What are some of these
changes that have taken place in the 
sex roles? Have such changes been 
more pronounced in the feminine role 
than in the masculine role? How 
have these changes affected the life 
adjustment of individuals? And the 
relationships of the sexes with each 
other? What about the effect on boys 
and girls at the present time and in 
the years ahead? These are just a few 
of the problems in the area of mascu- 
linity-femininity development and ad- 
justment that need to be studied and 
investigated. 

The present paper is primarily di-
rected toward a consideration of the
nature and theoretical implications of
sex-role development in children.

1 This paper was read at a symposium on
“Psychological Implications of Changing Sex
Roles,” at the annual meeting of the APA,
New York, September, 1957. Acknowledge-
ment with thanks is made to Ruth E. Hartley,
City College of New York, and to G. D.
Ofiesh, United States Air Force Academy, for
helpful suggestions concerning this paper.


DIFFERENTIATION OF SEX AND 
SEX ROLE 


As a starting point, consideration 
might be given to the age at which 
the child becomes aware of biological 
sex differentiation per se as well as 
when the child becomes aware of the 
essential meaning of “masculine” and
“feminine,” i.e., sex-role behavior.²
At what age for example is the aver- 
age child able to distinguish between 
the sexes and to distinguish himself 
or herself as a boy or girl? Evidence 
suggests that between two thirds and 
three fourths of children by the age 
of three are able to make this basic 
distinction (12, 13, 31). 

Evidence also suggests that sex- 
role differentiation is a gradual proc- 
ess, probably beginning in the second 
year of life and becoming definitely 
established by the age of three (18, 
30, 31). By or during the fifth year 
most children make a clear differen- 
tiation between the more obvious 
biological cues of maleness and fe- 
maleness and psychological cues of 
masculinity and femininity (1, 3, 9, 12,
13, 18, 20, 21, 26, 30, 31). As in the
other aspects of psychological de- 
velopment, there are undoubtedly 
wide individual differences in the 
clarity with which differences be-
tween the sexes are perceived by
children.

2 The concept sex role refers to those psy-
chological characteristics and behavioral pat-
terns that are typical of one sex in contrast to
the other sex. The sex role of a person consists
of the behavior that is socially defined and
expected of that person because of his or her
status as a male or female.

In any event, whatever the exact 
age in a particular case, it seems safe 
to conclude that preschool children 
as a group become fully aware of the 
fact that the world is divided into 
two groups of people and that, de- 
pending on whether one belongs to 
one group or the other, different be- 
havior patterns are expected accord- 
ingly. At an early age, then, children 
are being conditioned to and are 
actively acquiring their sex roles. 
One of the most important considera- 
tions here has to do with the meaning 
and significance to the child of the 
earliest perceptions of structural and 
sex-role differences between boys and 
girls. What does it mean to a child 
to become aware of his sex for the 
first time, and gradually, his sex 
role? For the child to feel safe, secure, 
and satisfied in his emerging sexual 
identity would appear to be one of 
the most important conditions in his 
entire development. 


SEX-ROLE PREFERENCE IN 
CHILDREN 


Related to the factor of age in sex 
and sex-role differentiation in chil- 
dren is the phenomenon of sex-role 
preference. Does preference for one 
sex role over the other parallel the 
developing awareness of the differ-
ence between the masculine and
feminine roles? Or does preference
come later, only after the child has
been exposed sufficiently to the dif-
ferential treatments accorded boys
in contrast to girls? The origin and
earliest occurrence of sex-role prefer-
ence is a problem that awaits research
investigation. That definite prefer-
ences exist in young children for one
or the other sex role, however, has
been reasonably well demonstrated
by several studies (1, 3, 9, 21, 26).

This problem has been investigated
by the present writer by means of a
technique known as the It Scale for
Children (2), a scale composed of 36
picture cards, three by four, of ob-
jects and figures typically associated
with the masculine or feminine roles
in our culture (e.g., preferring to play
with a tractor rather than a doll;
wearing a dress rather than trousers;
preferring to be a boy rather than a
girl, etc.). A child-figure called “It,”
relatively ambiguous as to sexual
identity, is used in administering the
scale by having each child make
choices for It, rather than the child
himself or herself making the choices
directly. Results based on the use of
the It Scale with children between
the ages of about 3½ and 11½, most
of whom were from middle class
homes, show that beginning with the
youngest preschool group (Ages 3½ to
5½) and extending through the fourth
grade (Ages 9½ to 10½) boys express a
stronger preference for the masculine
role than girls do for the feminine
role (1, 3, 17, 21). For example, at
the kindergarten and third-grade
levels, about 85% and 95% of the
boys respectively indicate that It
would rather be an “Indian Chief”
than an “Indian Princess.” And
when asked which shoes It would
rather “dress up and play house in,”
about 75% and 95% of the kinder-
garten and third-grade boys respec-
tively chose men’s rather than
women’s shoes.

Girls between the ages of 3½ and
6½ are quite heterogeneous as a
group: some are predominantly femi-
nine, choosing practically all of the
feminine alternatives; others are pre-
dominantly masculine, and still
others are “in-betweens,” choosing
both masculine and feminine alterna-
tives. Taken as a group, for example,
50% express a preference for It
“playing grownups” with cosmetic
articles and 50% with shaving arti-
cles.

After about the sixth year and ex- 
tending through the ninth year, most 
girls show a very strong preference 
for masculine in contrast to feminine 
things. For example, between 60% 
and 70% of the girls in the first, 
second, third, and fourth grades indi- 
cate that It would rather work with 
“building” tools than with ‘cooking 
and baking”’ utensils. 

It is not known whether girls in the 
fifth grade and beyond (age group 
from about 10 to 11 and older) be- 
come less masculine in preference. 
Brown's study (3) of fifth-grade sub- 
jects indicated a definite feminine 
changeover in girls, but Hogan (19) 
failed to find any such change in the 
preference patterns of either fifth- or 
sixth-grade subjects. The whole 
problem of change in sex-role prefer- 
ence in relation to age needs further 
and more intensive study. 

In contrast to girls, boys at all ages 
show a strong preference for the 
masculine role. This preference is 
evident in the youngest group (ages
3½ to 5½) and becomes even stronger
until it reaches a near maximum at
about the age of eight and thereafter.
Thus, between 90% and 95% of boys
in the second, third, fourth, and fifth 
grades indicate that, given a choice, 
It would rather wear a shirt and 
trousers than a dress. 


SEX-ROLE PREFERENCE IN 
CHILDREN COMPARED TO 
ADULTS 

To what extent are the sex-role 
preference patterns of children simi- 
lar to those of adults? For compara- 
tive purposes the Parental Role sec- 
tion of the It Scale may be used (3). 
This section involves asking the child 
whether It would rather be a mother
or a father. Results from this section
may be summarized as follows: From
about 80% to 95% of boys at all ages
from kindergarten through the fifth
grade express a preference for It be-
coming a father, only 5% to 20% for
It becoming a mother. On the other
hand, in the case of girls from kinder-
garten through the fourth grade, only
about 25% to 45% express a prefer-
ence for It becoming a mother, while
between 55% and 75% for It becom-
ing a father.

These results in the case of children 
are quite consistent with studies of 
adults in our culture which asked 
men and women: “Have you some- 
times wished you were of the opposite 
sex?” or “If you could be born over
again, would you rather be a man or
a woman?” or “Have you ever wished
that you belonged to the opposite
sex?” Results may be summarized
as follows: only between 2½% and 4%
of adult men compared to between 
20% and 31% of adult women recall 
consciously having been aware of the 
desire to be of the opposite sex (10, 
11, 34). And in Puerto Rico only 33% 
of a group of adult female students 
compared to about 93% of male stu-
dents indicated they would prefer to
be female and male respectively if 
they “could come to life again after 
death” (28). This lopsided preference 
for being male in preference to being 
female is also reflected in a recent 
survey of several hundred university 
students at Ohio State University 
who were asked whether they would 
rather have a male or female child in 
their family if they could have only 
one child (8). The results showed 
that 91% of the men and 66% of the
women students expressed a prefer-
ence for a male child. When both
groups were combined, boys were pre-
ferred by approximately 75% and
girls by only 25% of these students.
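
The combined figure follows from the two group figures only through the relative sizes of the two groups, which the summary above does not give. As a rough check, and purely as an illustrative sketch, the short Python calculation below solves the weighted-average identity for the share of men that would make 91% and 66% combine to about 75%. The function name `implied_share_of_men` and the treatment of the three percentages as exact values are assumptions made here for illustration; they are not taken from the survey itself.

```python
def implied_share_of_men(p_men: float, p_women: float, p_combined: float) -> float:
    """Solve p_combined = w * p_men + (1 - w) * p_women for w,
    the proportion of men among all respondents (hypothetical helper)."""
    if p_men == p_women:
        raise ValueError("group share cannot be recovered when both groups agree")
    return (p_combined - p_women) / (p_men - p_women)


# Percentages reported in the text, treated as exact for illustration only.
share_men = implied_share_of_men(p_men=0.91, p_women=0.66, p_combined=0.75)
print(f"{share_men:.2f}")  # prints 0.36
```

Under these assumptions men would make up a little over a third of the respondents. Because the 75% figure is itself stated as approximate, the implied share should be read as a rough indication only.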








A significant problem connected 
with these findings concerns the psy- 
chological effect on large numbers of 
women who openly admit having 
preferred to be male. How does such 
awareness affect the self-concept of a 
girl or woman? The result, according 
to White (35) is to undermine a 
woman’s respect for herself as a 
woman and to derogate the feminine 
role in general. 

An important anthropological 
analysis in connection with sex dif- 
ferences in acceptance of appropriate 
sex roles would be a cross-cultural 
comparison of the percentage of men 
compared to women who had pre-
ferred to be of the opposite sex. Com-
pared to those cultures, for example,
where male domination reaches exag- 
gerated proportions, very different 
results might be expected among the 
Burmese (7), Ojibwa Indians (33) 
and Tchambuli (23) where females 
have relatively high status and a 
favorable position in their society. 


FACTORS RELATED TO MASCULINE 
ROLE PREFERENCE 

What factors are functionally re- 
lated to the much greater preference 
that boys show for the masculine 
role than girls show for the feminine 
role and for the definite preference 
that many girls show for the mascu- 
line role? Although this is a problem 
in relation to which much research 
is needed, several conditions or fac- 
tors may be suggested as contribu- 
tory. 

First, there is the emphasis by 
Freud on the anatomical difference 
between males and females, the effect 
of which is supposed to make the boy 
proud of his status and the girl dis- 
satisfied with hers. Having versus 
not having a penis allegedly “ex-
plains” why girls as well as boys
prefer to be boys. 




Another attempt to account for sex
differences in role preference is the
emphasis by Adler on sociocultural
advantages that go with being male in
contrast to being female. The little
girl may early perceive the greater
prestige and numerous privileges
connected with the masculine role.
This would tend to arouse envy and
drive her in the direction of wanting
that which she does not have,
namely, masculine status. Adler
introduced the concept of “masculine
protest” to refer to this phenomenon.
That our culture has been and still is
masculine-centered and masculine-
oriented is obvious. The superior
position and privileged status of the
male permeates nearly every aspect,
minor and major, of our social life.
The gadgets and prizes in boxes of
breakfast cereal, for example, com-
monly have a strong masculine
rather than feminine appeal.³ And
the most basic social institutions per-
petuate this pattern of masculine ag-
grandizement. Thus, the Judeo-
Christian faiths involve worshipping
God, a “Father,” rather than a
“Mother,” and Christ, a “Son,”
rather than a “Daughter.”

A third factor relative to the dif-
ference between the sexes in role
preference is the greater latitude of the
girls compared to the boys in sex-role
development. It appears somewhat
paradoxical that, although restricted
much more in practically all other
respects, girls are allowed more free-
dom than boys in sex-role learning.
This is, however, simply consistent
with the idea that masculine status
is so superior to feminine status that
many girls are not even discouraged
from striving to attain the former.
For a girl to be a tomboy does not
involve the censure that results when
a boy is a sissy. With little, if any,
embarrassment or threat, girls may
show strong preference for the mas-
culine role; this is not true in the case
of boys.

3 Typical examples include: military equip-
ment, cowboy paraphernalia, police badges,
airplanes, boats, trains, spaceships, marbles,
yo-yoes, miniature auto license plates, etc.

Further evidence of the fact that 
girls in contrast to boys not only have 
much more opportunity to pattern 
their behavior after the model of the 
opposite sex but in many cases ac- 
tually do so is cited by Cunningham 
(6). She reports on a group of fourth- 
and fifth-grade students who, when 
asked to describe what they consider 
to be some of the “pressing problems
in human relations” included the
following: “How can I stop my sister
from being a tomboy?” Other exam- 
ples that may be cited include: 

1. Clothing. Girls may wear shirts 
and trousers with little or no social 
disapproval, but boys do not wear 
skirts or dresses; in fact, men who 
wear feminine clothing, i.e., trans- 
vestites, do so at the risk of severe 
social censure and even legal punish- 
ment. 

2. Names. Many girls are given 
masculinized names such as Jackie, 
Stephanie, Billie, Pauline, Jo, Ro- 
berta, Frankie, etc., but few boys are 
given feminized names. 

3. Toys and play activities. Girls 
may play with any or all of the toys 
typically associated with boys (e.g., 
cars, trucks, erector sets, guns, etc.) 
but boys are discouraged from play- 
ing with toys that are considered 
feminine (e.g., dolls, dishes, sewing 
materials, etc.). 

Goodenough (14, p. 318) has com- 
mented on the greater freedom of 
girls in sex-typed play as follows: “A 
boy is not likely to be a Dale Evans, 
but a girl often becomes Roy Rogers, 
or any of his masculine colleagues. 
Boys are rarely glamour girls, but
many little girls fall eagerly into the
roles of space men, or masculine
rough riders.”

Based on research findings that 
show boys consistently making more 
appropriate sex-typed choices than 
girls, Rabban (26) and Hurlock (20) 
conclude that “boys are more aware
of sex-appropriate behavior than
girls.” Rather than being “more
aware” than girls, however, it is the
relative lack of flexibility of boys in 
sex-role choices that probably ac- 
counts for some of the difference be- 
tween boys and girls in this regard. 
Boys simply do not have the same 
freedom of choice as girls when it 
comes to sex-typed objects and ac-
tivities. In this connection, Hartley⁴
raises the question as to whether or 
not results of studies of sex-role 
preference in children, rather than 
measuring role preference as such, 
might not simply reflect the fact that 
girls are given much and boys little 
opportunity for variation in express- 
ing preferences for sex-typed objects 
and activities. This is a good point 
and should be explored further. 

As to the basis of the narrow, rigid 
sex-typing pattern in males, Good- 
enough (14) presents evidence that 
suggests fathers show greater concern 
than mothers for sex-appropriate be- 
havior in their children. In other 
words, father is more likely than
mother to insist that “junior” look
and talk and act like a man. This 
pattern, which would tend to have 
greater impact on the boy than the 
girl, is consistent with findings pre- 
sented in the present paper, showing 
boys are much more likely than girls 
to make sex-appropriate choices. 

Related to these differences in sex
roles in childhood appears to be a
parallel difference in adult occupa-
tional roles. Even though women
traditionally have been subject to
various kinds of vocational and eco-
nomic discrimination, it is still true
that a woman may and does enter a
“masculine” vocation or profession,
e.g., bus driver, engineer, lawyer,
etc., with less social disapproval or
concern as to one’s sex-role “normal-
ity” than a man who enters a
“feminine” field, e.g., hair stylist,
dress designer, nurse, etc. The census
in 1950, for example, revealed that
women are now in all of the 446
occupations reported by the census.
Among the 16,000,000 American
women employed, there are “lady”
carpenters, sailors, tractor drivers,
pilots, telephone linesmen, locomo-
tive engineers, lumbermen, firemen,
and even stevedores and longshore-
men!

4 Personal communication from Ruth E.
Hartley.


SEX-ROLE IDENTIFICATION AND 
SEX-ROLE PREFERENCE

In dealing with the complex prob- 
lem of sex-role behavior it seems par- 
ticularly important to distinguish be- 
tween sex-role identification and sex- 
role preference (1). Identification is 
the basic process in which a child, at 
first involuntarily, and later con- 
sciously, learns to think, feel, and act 
like members of one sex in contrast 
to the other sex. Preference refers to 
the tendency to adopt the sex role of 
one sex in contrast to that of the 
other sex, the former being perceived 
as more desirable and attractive. 
With this distinction in mind it is 
possible to delineate three major sex- 
role patterns: (a) Identification with
and preference for the sex role of one’s
own sex, e.g., a girl may identify
with and prefer the feminine role; (b)
Identification with the sex role of
one’s own sex but preference for the
sex role of the opposite sex, e.g., a
girl may identify with the feminine
role but prefer the masculine role;
(c) Identification with the sex role of
the opposite sex but preference for
the sex role of one’s own sex, e.g., a
girl may identify with the masculine
role but prefer the feminine role. Of
the two processes, identification ap- 
pears to be primary, while preference 
is more or less secondary relative to 
sex-role behavior. In normal develop- 
ment the two form a single, integra- 
tive process. 

In view of the finding that mascu- 
line role preference appears to be 
widespread among girls, it might be 
hypothesized that conflict or confu-
sion will be conspicuous in their sex-
role development. Thus, the fact
that girls are destined for feminine 
functions in adulthood, yet envy and 
attempt to emulate the masculine 
role in childhood would tend to pro- 
duce ambivalence and a lack of 
clarity in the feminine role (16, 24, 
31). On the basis of a study of sex- 
role learning in five-year-olds, for 
example, Fauls and Smith (9) refer to 
the “lack of clear definition” of a sex 
role in the case of female children. 
Related to this is the contradiction 
between the sex-role identification of 
many girls with the feminine model 
and the tendency for them to prefer 
the masculine role. 

On the other hand, boys do not 
necessarily escape difficulties in sex- 
role development. Even though the 
culture greatly favors the male, the 
fact that boys must shift from an orig- 
inal identification-attachment with 
the mother to an identification with
the father may create difficulties 
for boys that girls do not experience 
(30). Thus, Sears (30) reports that 
six-year-old boys have not identified 
with their fathers as well as girls have 
with their mothers. On the basis of
extensive observations of children in
preschools, Hartley arrives at a con-
clusion similar to that of Sears and,
in addition, raises the question as to 
whether many boys really experience 
their fathers in their paternal role. 
She also questions whether many 
boys even picture themselves as 
“future fathers” (18).

It is also true that a considerable 
number of boys get overly exposed 
to the feminine model in early life 
when the mother is much more 
prominent in the life of the child than 
the father. This is especially likely 
to occur if for any reason the father 
is psychologically distant or a pre- 
dominantly negative figure for the 
son and there is no adequate sub- 
stitute. 

According to Parsons (25) and 
Gorer (15) a major effect of the situa- 
tion in which the father is typically 
away most of the time while the 
mother is around continually exem- 
plifying the feminine model is to 
facilitate the role development of the 
girl and to complicate the role de- 
velopment of the boy. These writers 
seem to emphasize the quantity of 
the parent-child relationship rather 
than the quality of such a relation- 
ship. In other words, the degree that 
the child respects, admires and loves 
the parent may be much more sig- 
nificant than the sheer amount of 
contact, per se.⁵


SEX-ROLE DEVELOPMENT AND 
ADULT SEXUAL ADJUSTMENT 


A boy who incorporates the basic 
features of the feminine model via 
predominant identification with the 
mother intrinsically will feel most 
comfortable in the feminine role, 
which to him is “normal” and
“natural.” Such a boy will show a
“feminine protest,” i.e., he will pro-
test any restriction of his desire and
effort to become thoroughly feminine.
He will often plead and even demand
the freedom to adopt the feminine
role (27). This is the developmental
pattern in childhood that seems to
provide the basis for sex-role inver-
sion in adulthood. In fact, inversion
refers precisely to the adoption of the
basic behavior patterns that are char-
acteristic of the opposite sex (4a).

5 Acknowledgement is made to L. E.
Dameron for making this point in discussion
with the writer.

In cases of males that do not in- 
volve a relatively complete inversion 
of sex role but do show considerable 
feminine identification, the result 
may be boys who become rebellious 
and develop strong defensive reac- 
tions in the form of extreme aggres- 
siveness as a means of attempting to 
counteract their underlying inverted 
tendencies. MacDonald (22) has 
presented a number of cases of 
“effeminate” boys who developed 
pathological aggressive reactions. 

Although direct evidence is limited 
it appears that the child’s eventual 
sexual orientation and adjustment in 
adolescence and adulthood bears a 
direct relationship with the nature of 
his sex-role development in child- 
hood. Adult sexual behavior, at least 
in part, appears to be an outgrowth 
of the individual’s underlying sex 
role. Thus, a normal male is one who 
has identified with, incorporated, and 
prefers the masculine role; his sexual 
desire for the female is one aspect of 
this role. A boy who has identified 
with, incorporated, and prefers the 
feminine role will most likely desire a 
male as a sexual partner in adulthood 
in keeping with the inverted role pat- 
tern. The problem of normal and 
inverted sex-role development has 
been discussed in another paper (4). 


SEX-ROLE CONVERGENCE: A NEW
CULTURAL PATTERN EMERGING?


Despite the fact that boys, much 
more than girls, show a concern for 








SEX-ROLE DEVELOPMENT 


behaving along sex-appropriate lines, 
there has been considerable change in 
the direction of both masculine and 
feminine roles becoming broader, 
less rigidly defined, less sex-typed, 
and more overlapping with each 
other. As Seward (32, p. 175) ob-
serves, “Today in the post-World
War II United States, there is a good
deal less self-consciousness about sex
roles and probably more freedom of
choice for the individual than ever
before.” In line with this observation
is a new course in domestic arts for
eighth-graders in a public school in
Jersey City, New Jersey, in which
boys learn how to cook, sew, and be-
come “efficient housewives,” and in
which girls learn how to handle
“man-sized tools,” do woodwork,
plumbing repairs, and become the
“man-of-the-house.” This course is
described as so successful that the
sexes may be switched in all eighth-
grade homemaking and shop courses
in the Jersey City system. The same
type of course has been established 
recently in a junior high school in St. 
Petersburg, Florida. And in the 
public senior high schools in Denver, 
Colorado, courses in cooking for 
boys, metal crafts and lathe work for 
girls, and child care and training for 
both boys and girls are offered. 
Other indications of the trend
toward increasing similarity of sex
roles include: (a) similarity of
educational experiences of girls and
boys from kindergarten through the
secondary school system; (b) husbands
doing the dishes, cleaning the house,
and carrying out other domestic tasks
historically considered exclusively
“feminine”; (c) wives holding down
jobs outside the home, many of which
have been traditionally “masculine”;
and (d) the apparel of boys and men
that emphasizes color, softness, and
more delicate features along with the
adoption by girls and women of all
kinds of “masculine” clothing, hair
styles, etc.

Mead (24) and Seward (32) have 
pointed out that this greater flexi- 
bility in sex-role learning makes for 
increased interfamily variability and, 
hence, increasing cultural diversity 
in this regard. Is it still possible, in 
our culture for example, to speak of 
the feminine role or the masculine 
role? Or is it necessary to refer to 
various roles? Thus, within a single 
neighborhood, the role of the hus- 
band-father in one home involves 
almost absolute control, while the 
role of the wife-mother is strictly sub- 
servient and dependent. Next door, 
the dominating control of the family 
may be maintained by the wife- 
mother, while the husband-father is 
little more than a financially
convenient “boarder.” Across the street
there may be hostile competitiveness
and a continual “power struggle”
between the husband-father and the
wife-mother, each at times emerging
“victorious,” the other “defeated.”
And, in still another home, the re- 
spective roles of husband-father and 
wife-mother are largely complemen- 
tary and equalitarian rather than 
hierarchical. What must be the effect 
of these very different parental role 
patterns on the sex-role identifica- 
tions and preferences of children who 
are developing in these respective 
familial environments? For example,
how is the process by which a boy
becomes like his father (i.e., “a man”)
influenced by the various role
structures in such families? It is
plausible that degree of ease and
normality or difficulty and abnormality
is directly related to the particular
parental role relationships. Intensive
study in this area is very much
needed.

Finally, on a culture-wide level, the
rapid changes in the sex roles of the
Japanese during the past decade
might be cited.⁶ Among other
contributing factors, the cultural
diffusion stemming from American
occupation of Japan has brought about
far-reaching changes, particularly in
the feminine role. In a country that
gave rise to the expression “as
unimportant as a Japanese woman,” the
traditional and relatively complete
subordination of the female to the
male appears to be on the way out
and is being replaced by a status of
women that is beginning to approach
that of men. This trend is reflected
not only in the fact that women can
now vote, an unheard-of practice ten
years ago, but also in the hopes and
aspirations of Japanese children as 
revealed in their drawings. When 
asked to draw pictures depicting 
what they wanted to be when they 
were grown, many girls drew pictures 
of teachers, secretaries, industrial 
workers, beauticians, scientists, etc. 

A somewhat parallel development
to that in Japan has been taking
place in Germany during the past
decade or so.⁷ Here, too, feminine
status has undergone marked change 
in the direction of greater freedom 
and opportunity for women in the 
educational and economic spheres. A 
continuing sociopsychological analy- 
sis of such significant and rapid 
changes in the feminine sex role of the 
Japanese and Germans should be 
very informative and valuable, espe- 
cially in terms of the impact on the 
present and future generation of 
children. 


SUMMARY 


The young child, as early as the
second year of life, begins to
distinguish between male and female
and between masculine and feminine.
Preference for one sex role or the
other also begins to emerge early in
the life of the child, probably by the
third year.

⁶ Life Magazine, March 29, 1954, 36, 89-95.
⁷ Life Magazine, May 10, 1954, 36, 107-112.

Beginning at the kindergarten level 
and extending through the fourth 
grade, boys show a much stronger 
preference for aspects of the mascu- 
line role than girls show for aspects of 
the feminine role. In fact, a majority 
of girls in Grades 1 through 4 ex- 
press greater preference for mascu- 
line things than for feminine things. 
These results are based on the It 
Scale for Children, a masculinity- 
femininity projective technique for 
use with young children. 

The finding that girls more than
boys show a preference for the role
of the opposite sex is paralleled by
studies of adults in our culture which
reveal that between five and twelve
times as many women as men recall
having wished they were of the
opposite sex.

As to the basis of masculine role
preference in both sexes, three factors
are mentioned: (a) the Freudian
emphasis on the anatomical differences
between males and females; (b) the
Adlerian emphasis on sociocultural
favoritism of the male compared to
the female; and (c) the fact that the
girl has more latitude than the boy
in expressing a preference for
sex-typed objects and activities.

A child may identify with and 
prefer the sex role appropriate to his 
own sex; or he may identify with and 
prefer the sex role of the opposite 
sex; or he may identify with one sex 
role and prefer the other. A distinc- 
tion between sex-role identification 
and sex-role preference is emphasized. 

In some ways girls would appear to 
have a more difficult time than boys 
in sex-role development; in other 
ways the development of boys would 
seem to be more complicated. The
general problem of sex differences in
ease of masculinity-femininity
development is discussed.

Adult sexual adjustment or mal- 
adjustment is related to the nature 
and outcome of sex-role development 
in childhood. 

There are definite signs that a con- 
vergence of the two sex roles gradu- 
ally is taking place in our society. 
This cultural trend is evident in the 
increasing overlap between things
and activities formerly considered
“exclusively masculine” or
“exclusively feminine.” A major effect
of this emerging cultural pattern is
widespread interfamily variability in
the sex roles of family members.

Finally, attention is called to the 
rapid changes in the feminine sex role 
in Japan and Germany during the 
past ten years. Emphasis is placed 
on the need for a continuing socio- 
psychological analysis of sex-role de- 
velopment in such changing cultures 
as those of the Japanese and Germans 
as well as that of our own. 


REFERENCES 


1. Brown, D. G. Sex-role preference in
   young children. Psychol. Monogr.,
   1956, 70, No. 14 (Whole No. 421).

2. Brown, D. G. The It Scale for Children.
   Grand Forks, North Dakota:
   Psychological Test Specialists, 1956.

3. Brown, D. G. Masculinity-femininity
   development in children. J. consult.
   Psychol., 1957, 21, 197-202.

4. Brown, D. G. The development of
   sex-role inversion and homosexuality.
   J. Pediat., 1957, 50, 613-619.

4a. Brown, D. G. Inversion and
   homosexuality. Amer. J. Orthopsychiat.,
   1958, in press.

5. Conn, J. H., & Kanner, L. Children’s
   awareness of sex differences. J. Child
   Psychiat., 1947, 1, 3-57.

6. Cunningham, Ruth, Elzi, Anna, Hall,
   J. A., Farrell, Marie, & Roberts,
   Madeline. Understanding group behavior
   of boys and girls. New York: Bureau of
   Publications, Teachers College,
   Columbia Univer., 1951.

7. Deignan, H. G. Burma: gateway to
   China. Washington, D. C.: Smithsonian
   Institution, 1943.

8. Dinitz, S., Dynes, R. R., & Clarke,
   A. C. Preference for male or female
   children: traditional or affectional.
   Marriage Fam. Living, 1954, 16,
   128-130.

9. Fauls, Lydia B., & Smith, W. D.
   Sex-role learning of five-year-olds.
   J. genet. Psychol., 1956, 89, 105-117.

10. Fortune Survey. Fortune, August, 1946.

11. Gallup, G. Gallup poll. Princeton:
   Audience Research Inc., June, 1955.

12. Gesell, A., Halverson, H. M.,
   Thompson, H., Ilg, F. L., Castner,
   B. M., Ames, L. B., & Amatruda, C. S.
   The first five years of life: the
   preschool years. New York: Harper,
   1940.

13. Gesell, A., Ilg, Frances L., Learned,
   J., & Ames, L. B. Infant and child in
   the culture of today: The guidance of
   development in home and nursery school.
   New York: Harper, 1943.

14. Goodenough, Evelyn W. Interest in
   persons as an aspect of sex difference
   in the early years. Genet. Psychol.
   Monogr., 1957, 55, 287-323.

15. Gorer, G. The American people. New
   York: Norton, 1948.

16. Gray, Susan W. Masculinity-femininity
   in relation to anxiety and social
   acceptance. Child Develop., 1957, 28,
   203-213.

17. Handy, G. D. The sex-role preference
   scale for children: a study of the
   It-figure. Unpublished master’s thesis,
   Univer. Denver, 1954.

18. Hartley, Ruth, Frank, L. K., &
   Goldenson, R. M. Understanding
   children’s play. New York: Columbia
   Univer. Press, 1952.

19. Hogan, R. A. Children’s sex-role
   preference with the It-figure.
   Unpublished master’s thesis, Univer.
   Denver, 1957.

20. Hurlock, Elizabeth B. Developmental
   psychology. New York: McGraw-Hill,
   1953.

21. Low, Willabe P. Sex of the examiner in
   relation to sex-role preferences in
   kindergarten children. Unpublished
   master’s thesis, Univer. Denver, 1957.

22. MacDonald, Martha W. Criminally
   aggressive behavior in passive
   effeminate boys. Amer. J.
   Orthopsychiat., 1938, 8, 70-78.

23. Mead, Margaret. Sex and temperament
   in three primitive societies. New York:
   New American Library, 1935.

24. Mead, Margaret. Male and female.
   New York: Morrow, 1949.

25. Parsons, T. Age and sex in the social
   structure of the United States. Amer.
   sociol. Rev., 1942, 7, 604-616.

26. Rabban, M. Sex-role identification in
   young children in two diverse social
   groups. Genet. Psychol. Monogr., 1950,
   42, 81-158.

27. Ronge, P. H. The “feminine protest.”
   Amer. J. indiv. Psychol., 1956, 12,
   112-115.

28. Sanchez-Hidalgo, E. [The feeling of
   inferiority in the Puerto Rican female.]
   Rev. Asoc. Maestros, P. R., 1952, 11 (6),
   170-171; 193.

29. Scheinfeld, A. Women and men. New
   York: Harcourt, Brace, 1944.

30. Sears, R. R., Maccoby, Eleanor E., &
   Levin, H. Patterns of child rearing.
   Evanston, Illinois: Row, Peterson,
   1957.

31. Seward, Georgene H. Sex and the social
   order. New York: McGraw-Hill, 1946.

32. Seward, Georgene H. Psychotherapy
   and culture conflict. New York: Ronald,
   1956.

33. Shaw, F. J., & Ort, R. S. Personal
   adjustment in the American culture. New
   York: Harper, 1953.

34. Terman, L. M. Psychological factors
   in marital happiness. New York:
   McGraw-Hill, 1938.

35. White, L., Jr. Educating our daughters.
   New York: Harper, 1950.


Received November 26, 1957. 











PSYCHOLOGICAL BULLETIN 
Vol. 55, No. 4, 1958


CONTENT AND STYLE IN PERSONALITY ASSESSMENT¹


DOUGLAS N. JACKSON 


Pennsylvania State University 


AND SAMUEL MESSICK 


Educational Testing Service and Princeton University 


In personality theory a ubiquitous 
and fundamental distinction may be 
drawn between the interpretation of 
behavior in terms of (a) the content
of “needs” and of cognitive structures
generally and in terms of (b)
characteristic styles of response and
action.
The separation of these two compo- 
nents of personality organization has 
taken a variety of forms in the hands 
of different theorists, as in the All- 
port-Vernon (2) Studies in Expressive 
Movements, in Murphy's (47) schol- 
arly discussion of continuity in per- 
sonality structure, in Klein’s (40) dis- 
tinction between needs and control 
processes, and in Vernon’s (54) dis- 
tinction between adaptive and ex- 
pressive behavior. One may legiti- 
mately ask not only what a person 
says or does (the particular content 
of his statements and actions) but 
how he acts (his characteristic mode 
or style of expression). 

What is conceptually a relatively 
sharp distinction is typically blurred 
and confounded in a particular con- 
crete act; the what and how are fused 
in a given goal-directed response. An 
obsequious person indicates his def- 
erence not only by the act of yield- 
ing, but by the tone of his voice in 
performing the yielding act. Because
content and style are intermixed in a
given behavior sequence, and because
there is often a theoretical
predilection for content components,
style is often overlooked in
personality assessment. Also, the
measurement of content appears to be
more direct and unambiguous than the
assessment of stylistic dimensions of
personality. It is possible, for
example, to ask a person what his
attitude is on a given topic, or to
draw inferences about his need
patterns from his reported likes and
dislikes (51). The obviousness of such
devices, while helpful from the
viewpoint of labeling what one hopes
one is measuring, also permits
respondents to distort their scores if
they so desire (32), something which
is less likely to occur in the
assessment of style.

¹ Portions of this paper were read at a
symposium on “Experimental Approaches to
Personality Assessment” at the American
Psychological Association Meetings in New
York, 1957.

The authors express their thanks to Lee
Sechrest and Riley W. Gardner for
commenting on the content and style of the
manuscript.

In considering the general
distinction between content and style,
those methods of personality and
attitude assessment which are based
upon printed questionnaires of one
form or another will be emphasized.
While the complementary constructs of
content and style have special
relevance to questionnaire items, where
the response-evoking properties of
the particular item form may
contribute markedly to response
variance above and beyond the
contribution of content, the
distinction might also be applied
usefully to other areas of personality
assessment. For example, three possible
applications are to perceptual and
cognitive style as in the work of
Thurstone (52), Witkin (58), Klein
(39, 40), Gardner (21), and others
(34); to achievement and aptitude
testing (28, 32, 60); and to the
perception of personality (2, 38, 54,
59).

The present discussion attempts to 
do two things: first, to present some 
evidence showing the important and 
subtle influences upon responses of 
stylistic components of item form; 
and, second, to illustrate how reliable 
measures of potentially useful stylis- 
tic dimensions may be generated 
from characteristic responses to the 
form of personality and attitude 
items as distinct from measures of 
content. 


PERSONALITY STYLE AND 
RESPONSE SET 

Traditionally, responses to a par- 
ticular item or set of items are as- 
sumed to provide information about 
the respondent in terms of the item 
content. If, for example, a person
agrees with the statement, “Under no
conditions is war justified,” or
answers “true” to the item, “I have
more trouble concentrating than
others seem to have,” it is commonly
assumed that these responses, if
consistent, will indicate respectively
something about the person’s
attitude toward war or his mental state.
Under these conditions response
determinants such as the subjects’
generalized tendency to agree are
legitimately considered as sources of
cumulative error, Cronbach’s (13, 14)
familiar “response sets.” While
Cronbach’s emphasis was that re- 
sponse sets often lead to errors of in- 
terpretation in the logical validity of 
tests, he also indicated that these re- 
sponse tendencies might not always 
be temporary and trivial, but may 
have a stable and valid component 
which reflects a consistent individual 
style or personality trait. While
recognizing Cronbach’s contribution in
describing the phenomenon, it is
preferable for the present purposes to
change the label from “response set”
to components of style. This change
in terms emphasizes the fact that for
certain purposes in personality
assessment opportunities for the
expression of personal modes for
responding should be enhanced and
capitalized upon, rather than
considered as sources of error to be
avoided or minimized. This change also
avoids the ambiguity inherent in the
concept of “set” (22).


CHARACTERISTIC STYLES IN 
PERSONALITY AND ATTITUDE 
QUESTIONNAIRES 

Among the more prominent re- 
sponse styles usually evoked by ques- 
tionnaire items are response acquies- 
cence, overgeneralization, a tendency 
to respond in a socially desirable way, 
and the complementary tendencies to 
respond negativistically, critically, 
and in a socially undesirable or
idiosyncratic manner. Some pertinent
illustrations will be drawn of how
each of these, operating singly and in 
combination, may influence the in- 
terpretation of responses to psycho- 
logical tests. Alternative procedures 
for evaluating these stylistic varia- 
bles will then be discussed. 


Response Acquiescence and Authoritarianism


It has been long recognized that a 
subject who agrees with a personality 
or attitude item stated in a positive 
form may not necessarily disagree 
with its logical opposite, but may in- 
stead show a fairly general tendency 
toward agreement or disagreement. 
Studies by Rundquist and Sletto 
(49), by Lorge (42), and reviews by 
Cronbach (13, 14), Berg (8), and 
Messick and Jackson (45), indicate 
that response acquiescence is
widespread and pervasive over a wide
variety of item content and most
pronounced when content is highly
ambiguous or imaginary. Berg (8, 9) has
suggested that acquiescence is a mo- 
dal response in our culture when the 
issue before the respondents is unim- 
portant or nonexistent. 

The operation of such stylistic ten- 
dencies should be taken into account 
in the course of personality measure- 
ment. If a particular content area is 
to be assessed, it is at least necessary 
to introduce into the scaling proce- 
dure appropriate experimental con- 
trols for acquiescence, or else recon- 
cile oneself to interpretive equivocal- 
ity due to the confounding of content 
and style in a single measure. Other
determinants of response besides
acquiescence, however, must be
controlled before characteristics may
be unequivocally attributed to
respondents on the basis of item
content. Messick and Jackson (45) have
discussed alternative methods for
reducing this ambiguity of
interpretation in the measurement of
authoritarian attitudes.²

Even though much of the recent
research with the California F scale
(1) has been of a methodological and
critical nature, it nevertheless yields
some important information on the
relationship between content and
style. A number of investigators (5,
46, 36, 37, 41, 45) have independently
correlated scores based on the
California F scale, in which all of the
items are so worded that agreement
is always scored in the authoritarian
direction, with scores based on
logically reversed F-scale items. These
correlations were not found to be high
and negative, as would be expected
from consistent responses to item
content. With one reversed F scale
(36), significant positive correlations
in the acquiescence rather than
the content direction were obtained.
Furthermore, there is evidence (37)
that previously obtained relationships
between personality variables
and the F scale, formerly thought to
be interpretable in terms of correlates
of authoritarian ideology or content,
may need reinterpretation in terms of
consistencies in style. The most
recent study requiring such
reinterpretation is one by Gilbert and
Levinson (23), in which a scale
purportedly measuring “custodial mental
illness ideology” was constructed, with
17 of 20 items requiring agreement to
be scored as “custodial ideology.” A
high correlation between the
“custodial ideology” scale and the F
scale was used to support the
conclusion that “preference for a
custodialistic orientation is part of a
broader pattern of personal
authoritarianism.” But Howard and
Sommer in a replication³ found that
“custodialism” correlated significantly
with agreements to both the original
and the Jackson-Messick (36) reversed
F scales, which would seem to indicate
that style rather than content is of
primary importance in this instance.

² Gage, Leavitt, and Stone (20) have argued
that confounding content and style in the F
scale, far from being a source of error, is
fortunate, because acquiescence contributes to
the empirical validity of the F scale as assessed
by independent ratings of authoritarian
behavior. If the aim is merely to predict
authoritarianism as a criterion, like predicting
the success of salesmen, this argument might be
legitimate as long as the criterion did not
change. But if one hopes to understand the
various components of a dynamic construct
like authoritarianism, conglomerate indices
containing both content and style will not
suffice and will confuse the issues (45).

³ Howard, T. W., & Sommer, R. “A Critical
Examination of ‘Ideology, Personality, and
Institutional Policy in the Mental Hospital.’”
Unpublished manuscript.

Christie, Havel, and Seidenberg
(12) have shown that it is possible in
some samples to obtain a correlation
between reversed and original F-scale
items significant in the content direc- 
tion. Jackson, Messick, and Solley 
(37) had previously reported a corre- 
lation of +.35 between agreements to 
original and to reversed F-scale 
items. What accounts for this appar- 
ently considerable discrepancy? One 
set of investigators predicted and ob- 
tained a correlation significant in the 
acquiescence direction, while another, 
with a different reversed F scale, pre- 
dicted and obtained a correlation in 
the content direction. The answer to 
this question requires a consideration 
of more than differences in the con- 
tent of the two reversed F scales; the 
form of the items must be examined. 
Jackson and Messick (36) indicated 
that the original, extremely worded, 
cliché-ridden style of the F scale was 
retained in their reversals, while 
Christie, Havel, and Seidenberg (12) 
explicitly avoided the sweeping gen- 
eralizations found in the originals 
and substituted much more cautious 
statements. It is likely that this dif- 
ference in item form accounts for the 
different results of the two sets of
investigators. It appears that the
tendency to endorse statements
containing phrases such as “every
person,” “no person,” “all,” “most
important,” “complete certainty,”
“never,” “must,” etc., is a general one,
which may act independently of the
content. This response style to
overgeneralize may contribute to
relationships between the F scale and
cognitive variables like rigidity (37)
and perceptual intolerance for
ambiguity (18). It probably also
partially accounts for the frequent
observation that verbally elicited
ethnic attitudes tend to be highly
intercorrelated (10), even, for example,
in Hartley’s (30) study where the
“groups” were nonexistent and no
previous attitude or “cognitive
structure” could be assumed to exist.
An appraisal of variance associated
with aspects of authoritarian content
on one hand, and stylistic components
like response acquiescence and
overgeneralization on the other, would
seem to require at least four sets of
items: an extremely worded original and
reversed F scale and a probabilistic
original and reversed F scale. It is
suspected that subjects endorsing
probabilistic F-scale items would not
show as much of the “authoritarian’s”
intolerance for ambiguity as might be
expected, although some relationship
between authoritarian ideology and
response style might still be obtained.
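
The logic of comparing original and reversed scales can be illustrated with a toy calculation. The sketch below is not drawn from any of the studies cited. It assumes a purely additive model in which agreement with an original item rises with both authoritarian content and acquiescence, while agreement with its reversal falls with content but still rises with acquiescence. The respondent values and the names `agreement_scores`, `content`, `no_style`, and `strong_style` are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def agreement_scores(content, style):
    """Toy additive model (an assumption, not taken from the cited studies).

    Original items:  agreement = content + acquiescence
    Reversed items:  agreement = -content + acquiescence
    """
    original = [c + s for c, s in zip(content, style)]
    reversed_scale = [-c + s for c, s in zip(content, style)]
    return original, reversed_scale


# Five hypothetical respondents.
content = [-2, -1, 0, 1, 2]        # standing on authoritarian content
no_style = [0, 0, 0, 0, 0]         # nobody differs in acquiescence
strong_style = [3, -3, 0, -3, 3]   # acquiescence varies, unrelated to content

r_content_only = pearson(*agreement_scores(content, no_style))
r_with_style = pearson(*agreement_scores(content, strong_style))
print(r_content_only, r_with_style)
```

Under these assumptions the correlation between the two scales is perfectly negative when respondents differ only in content, and it turns positive once individual differences in acquiescence are larger than those in content. Any real scale would of course add error and item-specific variance that this sketch leaves out.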


Response Acquiescence in Personality 
Inventories 

The distinct roles of content and
style should also be noted in
responses to personality inventories,
especially those “true-false” devices
like the MMPI developed by the
empirical selection of discriminating
items.
While few, if any, investigators have 
ever explicitly assumed that the total 
number of empirically derived scales 
was the most parsimonious way of 
summarizing the common variance of 
an inventory, the use of a large num- 
ber of separate scales as, for example, 
in the 9 clinical scales of the MMPI 
or the 18 scales of Gough’s California 
Psychological Inventory, is justified 
by the extent to which each makes 
some independent contribution to the 
assessment problem not made by the 
other scales.⁴ If there is a great deal
of common variance among the various
scales, this redundancy limits
their efficiency.

⁴ The MMPI was advanced initially as an
aid in the prediction of psychiatric diagnoses.
In practice it is rarely so used in any literal
sense, which is fortunate, as the research
evidence (e.g., 7, 48) indicates that predictions
of specific diagnoses generally cannot be made
with certainty. Rather, the original purpose
of the MMPI, prediction, has come to be
modified so that now scores, singly or in
combination, are used to draw inferences about
characteristics of respondents (56). Somewhat
different notions of validity (15) and a
different mathematical model (27, 53) are
necessary in the latter case.

There is considerable evidence that
a very few factors account for the
major proportion of the variance on
personality inventories of the
“true-false” variety. Wheeler, Little,
and Lehner (57), for example, reported a
factor analysis of MMPI scales in
which only two major factors and
one minor factor were identified. In
the light of accumulating evidence it
seems likely that the major common
factors in personality inventories of the
true-false or agree-disagree type, such
as the MMPI and the California
Psychological Inventory, are
interpretable primarily in terms of
style rather than specific item content.

One line of departure from which it
is possible to evaluate the role of
acquiescence in personality inventories
is to consider the percentage of items
keyed “true” in each scale as an
index of the extent to which that scale
elicits response acquiescence. Jackson
(33) did this with the California
Psychological Inventory, computing
rank order correlations between the
percentage “true” in each scale and
the scale’s correlation with outside
personality measures shown previously
to reflect acquiescence. A number of
high and significant correlations with
such unidirectional scales as the
California F scale and the MMPI K
scale suggest strongly that
acquiescence is a major source of
variance in the CPI.

Messick and Jackson⁵ have obtained
evidence of a similar nature for
the MMPI. They obtained rank order
correlations in the seventies between
each scale’s percentage “true”
and its loading on the first factor as
reported in each of several factor
analytic studies. Preliminary results
suggest that the first factor of the
MMPI is interpretable in terms of
acquiescence. Equally striking is a
recent factor analytic study by Welsh
(55), who sought to obtain pure-factor
MMPI scales through a variant of
the internal consistency method. He
was rather successful in developing
two such scales, labeled A (for
anxiety) and R (for repression), which
loaded highly on the first and second
factors, respectively. The remarkable
thing about these scales is that
all but one of the 39 items measuring
the first factor are keyed “true,”
while all 40 items for the second pure
factor scale are keyed “false.” Even
though Welsh’s two scales are
predominantly unidirectional, one in the
“true,” and the other in the “false”
direction, they yield only low
negative correlations with each other.
This would lead one properly to
reject the notion that a simple response
set was sufficient to account for all of
the variance in the two scales.
Nevertheless, each scale does seem to
have an acquiescence component, for such
a distribution of “true” and “false”
items would be unlikely to occur by
chance, and Jackson (33) has shown
that correlations based on both scales
correlated significantly with
percentage keyed “true” in each CPI
scale. Thus, careful consideration must
be given to the possibility that
response acquiescence is interacting
with another variable, either of
content or of style, and that responses
are determined in part by this
interaction. As with the F scale, where
acquiescence operates most strikingly
in conjunction with statements in the
form of sweeping generalizations, it
may be that acquiescence on the MMPI is
elicited differentially by certain
content categories, or in relation with
another stylistic component.

⁵ Messick, S. J., & Jackson, D. N. “Response
Style and Factorial Interpretation of
the MMPI.” In preparation.
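
A rank order correlation of the kind described above, between each scale's percentage of items keyed "true" and its loading on the first factor, might be computed as in the following sketch. The six scale values are invented for illustration and are not MMPI or CPI data; the function and variable names are likewise hypothetical. The sketch ranks both variables, giving tied values the average of their positions, and then takes the product-moment correlation of the ranks.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Rank order correlation: Pearson correlation of the two sets of ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for six scales (illustration only).
percent_keyed_true = [95, 80, 60, 45, 30, 10]
first_factor_loading = [0.85, 0.70, 0.75, 0.40, 0.20, -0.10]

rho = spearman(percent_keyed_true, first_factor_loading)
print(round(rho, 3))
```

With no tied values, the result equals the familiar formula 1 - 6Σd²/(n(n² - 1)), where d is the difference between the two ranks assigned to a scale. A high positive value would indicate, as in the text, that scales keyed mostly "true" are also the scales loading most highly on the factor.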

The specific source of the variables 
which appear to moderate (50) the 
operation of response acquiescence in 
the MMPI is obviously a--compli- 
cated research problem which awaits 
more evidence for a definitive answer. 
One very promising lead, however, is 
encountered in another important 
stylistic determinant of test-taking 
behavior, the general tendency to en- 
dorse socially desirable or socially un- 
desirable statements about oneself. 
This stylistic response tendency on 
the part of individuals should be dis- 
tinguished from the judged charac- 
teristics of desirable and undesirable 


item content. There is considerable 


evidence that this tendency is general 
and is related to a tendency to re- 
spond in an idiosyncratic or atypical 


manner. Edwards (16) has reported a 
correlation of .87 between judged so- 
cial desirability scale values and the 
proportion of respondents indepen- 
dently endorsing them. Hanley (29) 
obtained correlations of .82 and .89 
respectively between probability of 
endorsement and social desirability 
ratings for samples of items from the 
MMPI D and Se scales. Fordyce 
(17) correlated with the MMPI clin- 
ical scales a set of MMPI items 
judged to be socially desirable. His 
obtained correlations were high, rang- 
ing from —.38 to —.91. Although 
these coefficients indicate the impor- 
tance of social desirability in scales 
like the MMPI, they also reflect the 
influence of response acquiescence, 
since the social desirability scale con- 
tained a disproportionate number of 
items keyed false. Jackson (33) 
showed that a combination of ranked 
indices of response acquiescence and 
social desirability on scales of the 


JACKSON AND SAMUEL MESSICK 


California Psychological Inventory 
was related to the rank of each scale’s 
correlation with the MMPI K scale 
to the extent of r,=.86. This value 
was higher than the correlation of ei- 
ther response style operating singly, 
suggesting the possibility of summa- 
tive effects of response acquiescence 
and social desirability. 
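
The coefficient of .86 reported above relates two sets of ranks. A minimal sketch of such a rank-order correlation follows, assuming Spearman's formula for untied ranks (the source does not say which coefficient was computed). The function name and the five-scale ranks are hypothetical illustrations, not Jackson's data.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank-order correlation for two lists of untied ranks.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between paired ranks. Valid only when no ranks are tied.
    """
    if len(rank_x) != len(rank_y):
        raise ValueError("rank lists must have the same length")
    n = len(rank_x)
    if n < 2:
        raise ValueError("need at least two paired ranks")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for five scales (illustration only):
# rank on a combined style index, and rank of correlation with a criterion scale.
combined_style_rank = [1, 2, 3, 4, 5]
criterion_rank = [2, 1, 3, 5, 4]
rho = spearman_rho(combined_style_rank, criterion_rank)
```

With these invented ranks the squared differences sum to 4, so the coefficient is 1 - 24/120 = .80.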

Berg (8, 9), granting that there are 
modal response patterns, suggested 
that individual differences, particu- 
larly deviations, may be revealing of 
personality style. Berg hypothesized
that deviant behavior tends to be 
general and not specific to any particular
content area. Barnes (3, 4),
appraising the Berg deviation hy- 
pothesis in the MMPI, shed impor- 
tant light on the relation between an 
acquiescent style and idiosyncratic 
responses. Barnes demonstrated a 
close correspondence between Wheel- 
er, Little, and Lehner’s (57) first or 
“psychotic” factor and total number 
of items answered deviantly true, and 
between their second or “neurotic” 
factor and total number of items an- 
swered deviantly false. Although re- 
sponse acquiescence and the tenden- 
cy to respond in a socially undesir- 
able or deviant manner are con- 
founded in Barnes’ analysis, these re- 
sults strongly support the notion that 
items judged low in social desirability 
evoke different tendencies toward ac- 
quiescence, as compared with items 
judged high in social desirability. 
This interpretation appears consis- 
tent with Welsh’s (55) data, where 
the first pure factor scale, composed 
of 38 “true” items out of 39, contains
many socially undesirable statements,
while the second pure factor
scale, where all the items are keyed 
false, seems to consist predominantly 
of neutral or somewhat socially de- 
sirable statements. Here again, a 
consistent response style to acquiesce 
seems to be elicited differentially by
a variety of self-deprecatory statements
on the one hand, while, alternatively,
neutral or mildly socially
desirable statements evoke consistent 
differential tendencies to disagree or 
to be negativistic. 

Whether there are consistencies at- 
tributable to content after allowing 
for style in these first two factors or, 
indeed, in any obtained scores on the 
present form of the MMPI is an im- 
portant research question, as is the 
relation between various content and 
stylistic factors and psychopathology.
If Berg (8) is correct, if one might
just as well use abstract drawings (3) 
as items to discriminate empirically 
psychiatric patients from normals, 
then it may be that content is less im- 
portant and style more important 
than previously supposed. If this is 
the case, then past attempts to draw 
conclusions about respondents on the
basis of their answers to uncontrolled 
item content are suspect. If, on the 
other hand, consistencies in content 
can be demonstrated above and be- 
yond components of style, it is ex- 
tremely important that measures of 
these content variables make ade- 
quate use of proper experimental 
controls to avoid as far as possible 
confounding with style. Use of recent 
advances in scaling theory (27, 53) 
might be helpful. 

MEASURING PERSONALITY STYLES 

In approaching the problem of the 
assessment of style, a curious dilem- 
ma presents itself. On the one hand, 
it is easy to show that most personality
tests are loaded with stylistic
components, but on the other hand, 
good measuring devices for these di- 
mensions do not exist, largely be- 
cause few research workers have at- 
tempted explicitly to devise such 
scales. Typically, a single measure, 
like the California F scale, the MMPI 
K scale, or Bass’s (6) collection of
aphorisms, has been offered as an index
of a response style, acquiescence,
for example. Little thought is given 
to the fact that these measures may 
not only contain several dimensions 
of content, but of style as well, thus 
limiting their usefulness as indices of 
any particular style. Thus, Fordyce 
(17) has suggested that the MMPI 
K scale reflects tendencies to respond 
in a socially desirable manner, while 
Fricke (19) has argued that the K
scale reflects acquiescence. Evidence
from each of the two authors is convincing,
and, indeed, Jackson's study
(33) supports the notion that the K
scale contains both acquiescence and 
social desirability variance. It may 
reflect other things as well, but this 
confounding is not conducive to its 
use as a measure of one particular 
style. The same criticism might be 
leveled at the California F scale, 
at Edwards’ (16) social desirability 
scale, and at Bass's (6) social acquies- 
cence scale, all of which seem to con- 
found response acquiescence with so- 
cial desirability. 

One way to construct measures of 
such styles as acquiescence or over- 
generalization would involve selec- 
tion of items extremely heteroge- 
neous in content. Experimentally in- 
dependent measures of each style 
would, of course, be desirable. Since 
a response style to answer in a so- 
cially desirable or undesirable direc- 
tion seems to be omnipresent, it is 
hard to avoid in measures of other 
styles. Rather than attempting to 
develop items all at one level of social 
desirability, it might be better to 
vary social desirability systemati- 
cally and to observe its relationships 
and interactions with other variables. 
Helmstadter (31) has described pro- 
cedures for obtaining separate scores 
for different components of a test, 
some of which would be especially 
relevant to a situation in which one
had already obtained social desirability
scale values. Although social
desirability has been assumed to be 
one-dimensional, it is easy to con- 
ceive of distinct, but perhaps corre- 
lated, dimensions consisting of items 
reflecting irresponsibility, psychiatric 
bizarreness, or hostility. The selec- 
tion of sets of items for different di- 
mensions of judged social desirability 
would be facilitated by the application
of recent advances in multidimensional
scaling (44). Such refinements
as separating out the components
of social desirability would do
much to clarify response determi- 
nants and might put personality eval- 
uation upon a more rigorous basis 
than has previously been thought 
possible. 

Although the emphasis in this paper
has been on some of the more conspicuous
stylistic determinants encountered
in common personality
tests, there are many other possible
measures of style that might be derived
from personality theory. For
example, a tendency to express a lik- 
ing for diverse things, although it 
might be response acquiescence in a 
new disguise, might also represent 
greater cognitive differentiation or 
capacity to invest energy freely in 
objects in one’s environment. Such 
general expressions of “like” and
“dislike” have been found to be reliable.
On one set of 300 items dealing
with diverse activities (51), the corrected
split-half reliability of the
tendency to respond “like” was .86.
With a paucity of evidence on these 
issues, the alternative to such conjec- 
ture is carefully planned research, for 
which there is an obvious need. 
There are many other research oppor- 
tunities for the measurement of style, 
such as asking respondents to select 
from among two or more personality,
attitude, or achievement items, equal
in valence or correctness, but couched 
in different phrasings—perhaps one 
elaborate and pedantic, one simple, 
and one containing slang. Preferred 
modes or styles of expression might 
also be readily evaluated by tech- 
niques disguised as achievement tests 
(32). In this context, it would be in- 
teresting to evaluate personality cor- 
relates of such attributes as toler- 
ance for logical contradictions within 
a passage, of a tendency to gamble on 
achievement tests (28, 54), and a va- 
riety of other consistent modes of re- 
sponse. Similarly, further research is 
needed to evaluate Jackson's (34, 35) 
hypothesis that respondents who ac- 
quiesce consistently manifest a lower 
level of cognitive energy in other sit- 
uations. 
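
The "corrected split-half reliability" of .86 mentioned above was presumably obtained by stepping up the correlation between two half-tests. A minimal sketch follows, assuming the usual Spearman-Brown correction (the source does not name the correction used). Both function names are hypothetical.

```python
def spearman_brown(half_r: float) -> float:
    """Estimate full-length reliability from the correlation between two halves.

    Spearman-Brown step-up for doubled test length: r_full = 2r / (1 + r).
    """
    if not -1.0 < half_r <= 1.0:
        raise ValueError("half-test correlation must lie in (-1, 1]")
    return 2.0 * half_r / (1.0 + half_r)


def implied_half_r(full_r: float) -> float:
    """Invert the step-up: the half-test correlation implied by a corrected value."""
    if not -1.0 < full_r <= 1.0:
        raise ValueError("corrected reliability must lie in (-1, 1]")
    return full_r / (2.0 - full_r)


# Under this assumption, a corrected value of .86 implies an
# uncorrected half-test correlation of about .75.
uncorrected = implied_half_r(0.86)
```

The inverse function is included only to show what uncorrected correlation such a reported value would imply under this assumption.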


SUMMARY 


It has been suggested that stylistic 
determinants, such as acquiescence, 
overgeneralization, and a tendency to 
respond in a socially undesirable
manner, as distinct from specific con- 
tent, account for a large proportion of 
response variance on some personali- 
ty scales, particularly the California 
F scale, the MMPI, and the Cali- 
fornia Psychological Inventory. In 
developing and evaluating measures 
of style it is important to select not 
only those measures which have ap- 
peared by accident on already estab- 
lished tests, but to design assessment 
techniques explicitly to evoke theo- 
retically important styles of response. 
Research involving response style 
may contribute to a more systematic 
measurement in personality and may 
pay off handsomely in helping to fur- 
ther the common ground between 
personality theory and personality 
assessment. 










REFERENCES 


1. Adorno, T. W., Frenkel-Brunswik, Else, Levinson, D. J., & Sanford,
    R. N. The authoritarian personality. New York: Harper, 1950.
2. Allport, G. W., & Vernon, P. E. Studies in expressive movements.
    New York: Macmillan, 1932.
3. Barnes, E. H. Response bias in the MMPI. J. consult. Psychol.,
    1956, 20, 371-374.
4. Barnes, E. H. Factors, response bias, and the MMPI. J. consult.
    Psychol., 1957, 20, 419-421.
5. Bass, B. M. Authoritarianism or acquiescence? J. abnorm. soc.
    Psychol., 1955, 51, 611-623.
6. Bass, B. M. Development and evaluation of a scale for measuring
    social acquiescence. J. abnorm. soc. Psychol., 1956, 53, 296-299.
7. Benton, A. L. The MMPI in clinical practice. J. nerv. ment.
    Disease, 1945, 102, 416-420.
8. Berg, I. A. Response bias and personality: the deviation
    hypothesis. J. Psychol., 1955, 40, 61-72.
9. Berg, I. A., & Rapaport, G. M. Response bias in an unstructured
    questionnaire. J. Psychol., 1954, 38, 475-481.
10. Campbell, D. T., & McCandless, B. R. Ethnocentrism, xenophobia,
    and personality. Hum. Relat., 1951, 4, 185-192.
11. Chapman, L. J., & Campbell, D. T. Response set in the F scale.
    J. abnorm. soc. Psychol., 1957, 54, 129-132.
12. Christie, R., Havel, Joan, & Seidenberg, B. Is the F scale
    irreversible? J. abnorm. soc. Psychol., 1958, 56, 143-159.
13. Cronbach, L. J. Response sets and test validity. Educ. psychol.
    Measmt, 1946, 6, 475-494.
14. Cronbach, L. J. Further evidence on response sets and test
    design. Educ. psychol. Measmt, 1950, 10, 3-31.
15. Cronbach, L. J., & Meehl, P. E. Construct validity in
    psychological testing. Psychol. Bull., 1955, 52, 281-302.
16. Edwards, A. L. The relationship between the judged desirability
    of a trait and the probability that the trait will be endorsed.
    J. appl. Psychol., 1953, 37, 90-93.
17. Fordyce, W. E. Social desirability in the MMPI. J. consult.
    Psychol., 1956, 20, 171-175.
18. Frenkel-Brunswik, Else. Intolerance of ambiguity as an emotional
    and perceptual personality variable. J. Pers., 1949, 18, 108-143.
19. Fricke, B. G. Response set as a suppressor variable in the OAIS
    and MMPI. J. consult. Psychol., 1956, 20, 161-169.
20. Gage, N. L., Leavitt, G. S., & Stone, G. C. The psychological
    meaning of acquiescence set for authoritarianism. J. abnorm. soc.
    Psychol., 1957, 55, 98-103.
21. Gardner, R. W. Cognitive styles in categorizing behavior.
    J. Pers., 1953, 22, 214-223.
22. Gibson, J. J. A critical review of the concept of set in
    contemporary experimental psychology. Psychol. Bull., 1941, 38,
    781-817.
23. Gilbert, Doris C., & Levinson, D. J. Ideology, personality, and
    institutional policy in the mental hospital. J. abnorm. soc.
    Psychol., 1956, 53, 263-271.
24. Gough, H. G. Studies of social intolerance: I-IV. J. soc.
    Psychol., 1951, 33, 237-271.
25. Gough, H. G. Predicting social participation. J. soc. Psychol.,
    1952, 35, 227-233.
26. Gough, H. G. California Psychological Inventory Manual. Palo
    Alto: Consulting Psychologist Press, 1957.
27. Green, B. F. Attitude measurement. In G. Lindzey (Ed.), Handbook
    of social psychology, Vol. 1. Cambridge: Addison-Wesley, 1954.
28. Guilford, J. P., & Lacey, J. I. (Eds.) Printed classification
    tests. Washington: U.S. Government Printing Office, 1947.
29. Hanley, C. Social desirability and responses to items from three
    MMPI scales: D, Sc, and K. J. appl. Psychol., 1956, 40, 324-328.
30. Hartley, E. L. Problems in prejudice. New York: Kings Crown
    Press, 1946.
31. Helmstadter, G. C. Procedures for obtaining separate set and
    content components of a test score. Psychometrika, 1957, 22,
    381-394.
32. Hills, J. R. Objective tests of personality for practical use.
    Princeton, N. J.: Educational Testing Service Research
    Memorandum 57-4, 1957. (Multilithed report.)
33. Jackson, D. N. Response acquiescence in the California
    Psychological Inventory. Amer. Psychologist, 1957, 12, 412-413.
    (Abstract)
34. Jackson, D. N. Independence and resistance to perceptual field
    forces. J. abnorm. soc. Psychol., in press.
35. Jackson, D. N. Cognitive energy level, response acquiescence,
    and authoritarianism. J. soc. Psychol., in press.
36. Jackson, D. N., & Messick, S. J. A note on ethnocentrism and
    acquiescent response sets. J. abnorm. soc. Psychol., 1957, 54,
    132-134.
37. Jackson, D. N., Messick, S. J., & Solley, C. M. How “rigid” is
    the “authoritarian”? J. abnorm. soc. Psychol., 1957, 54, 137-140.
38. Jackson, D. N., Messick, S. J., & Solley, C. M. A
    multidimensional scaling approach to the perception of
    personality. J. Psychol., 1957, 44, 311-318.
39. Klein, G. S. The Menninger Foundation research on perception and
    personality, 1947-1952: a review. Bull. Menninger Clin., 1953,
    17, 93-99.
40. Klein, G. S. Need and regulation. In M. R. Jones (Ed.), Nebraska
    Symposium on Motivation. Lincoln: Univer. Nebraska Press, 1954.
41. Leavitt, H. J., Hax, H., & Roche, J. H. “Authoritarianism” and
    agreement with things authoritative. J. Psychol., 1955, 40,
    215-221.
42. Lorge, I. Gen-like: Halo or reality? Psychol. Bull., 1937, 34,
    545-546.
43. Meehl, P. E. The dynamics of “structured” personality tests.
    J. clin. Psychol., 1945, 1, 296-303.
44. Messick, S. J. Some recent theoretical developments in
    multidimensional scaling. Educ. psychol. Measmt, 1956, 16,
    82-100.
45. Messick, S. J., & Jackson, D. N. The measurement of
    authoritarian attitudes. Educ. psychol. Measmt, in press.
46. Messick, S. J., & Jackson, D. N. Authoritarianism or
    acquiescence in Bass’s data. J. abnorm. soc. Psychol., 1957, 54,
    424-426.
47. Murphy, G. Personality. New York: Harper, 1947.
48. Rubin, H. Validity of a critical-item scale for schizophrenia on
    the MMPI. J. consult. Psychol., 1954, 18, 219-220.
49. Rundquist, E. A., & Sletto, R. F. Personality in the depression.
    Minneapolis: Univer. of Minnesota Press, 1936.
50. Saunders, D. R. Moderator variables in prediction. Educ.
    psychol. Measmt, 1956, 16, 209-222.
51. Stern, G. G., Stein, M. I., & Bloom, B. S. Methods in
    personality assessment. Glencoe, Ill.: Free Press, 1956.
52. Thurstone, L. L. A factorial study of perception. Chicago:
    Univer. Chicago Press, 1944.
53. Torgerson, W. S. Theory and methods of scaling. New York: Wiley,
    in press.
54. Vernon, P. E. Personality tests and assessments. New York: Holt,
    1953.
55. Welsh, G. S. Factor dimensions A and R. In G. S. Welsh and W. G.
    Dahlstrom (Eds.), Basic readings on the MMPI. Minneapolis:
    Univer. of Minn. Press, 1956. Pp. 264-281.
56. Welsh, G. S., & Dahlstrom, W. G. (Eds.) Basic readings on the
    MMPI. Minneapolis: Univer. of Minn. Press, 1956. Pp. 290-337.
57. Wheeler, W. M., Little, K. B., & Lehner, G. F. J. The internal
    structure of the MMPI. J. consult. Psychol., 1951, 15, 134-141.
58. Witkin, H. A., Lewis, H. B., Hertzman, M., Machover, K.,
    Meissner, P. B., & Wapner, S. Personality through perception.
    New York: Harper, 1954.
59. Wolff, W. Expression of personality. New York: Harper, 1943.
60. Zimmerman, W. S. The influence of item complexity upon the
    factor composition of a spatial visualization test. Educ.
    psychol. Measmt, 1954, 14, 106-119.


Received November 14, 1957. 





PSYCHOLOGICAL BULLETIN 
Vol. 55, No. 4, 1958


A NEED FOR ALERTNESS TO MULTIVARIATE EXPERIMENTAL
FINDINGS IN INTEGRATIVE SURVEYS


RAYMOND B. CATTELL 
Laboratory of Personality Assessment and Group Behavior 
University of Illinois 


A number of glaring omissions of 
relevant data have been evident in 
surveys of particular scientific fields 
made over the past 10 years. On
closer examination it becomes evident
that these omissions follow a
certain pattern. It seems most de- 
sirable to call attention to this un- 
realized source of distortion in the in- 
terests of a fuller use of experimental 
resources. 

In general, the reviewer of a field 
does a thorough job in terms of collecting,
from Psychological Abstracts,
and the trail leading through bibli- 
ographies, the principal findings of 
univariate experiments, i.e., those us- 
ing the classical design of varying an 
“independent variable” and observing
the changes in a “dependent variable.”
But his final list of discovered
associations, for the variable or con- 
cept on which the survey is focused, 
frequently shows, by contrast, a 
shockingly inefficient reporting of the 
associations found in multivariate ex- 
periments, e.g., factor analytic experi- 
ments, in which his particular con- 
cept or variable happens to have been 
investigated in company with a whole 
group of variables. 

It is easy to see how this happens. 
After setting down the experiments 
on the subject that are known to him 
through personal interest, or through 
discussion with a circle of scientific 
acquaintances, the person charged 
with the task of making an integra- 
tive survey of the area turns to the 
literature. As indicated above, he is 
likely to turn to the index of Psychological
Abstracts and the annual indexes
of a wide array of journals, and
to look up all reported studies ap- 
pearing under the rubric of the given 
topic, as well as under various possi- 
ble synonyms for that rubric. If his 
vision is no wider than that of a train- 
ing in the traditional, univariate ex- 
periment, he then closes the books 
and feels justified that he has made a 
thorough search for all relevant 
studies occurring in the given period 
of time. 

Unfortunately, he has omitted the 
whole class of research reports in
which no reference to all the concepts 
and variables involved in the given 
study can be made in the title. It oc- 
casionally happens in univariate stud- 
ies that the statistically significant 
finding does not occur in the main 
relation, but in terms of some “by-product”
finding. But it happens systematically
and constantly in the
area of multivariate designs that one 
study deals with, or reveals, more sig- 
nificant relations, and bears on more 
concepts, than can possibly be indi- 
cated, even with a supreme effort at 
theoretical condensation, in a title. 
Most concepts hinge, operationally, 
upon a variable. Nowadays, factor 
analytic studies, with the aid of the 
electronic computer, may work with 
more than a hundred variables and 
five thousand relationships, which no 
title satisfactory to an editor could 
contain. 
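
The figure of "five thousand relationships" for "more than a hundred variables" matches the number of distinct pairwise correlations among that many variables. A minimal sketch of the count follows, assuming that "relationships" means the off-diagonal entries of one triangle of the correlation matrix. The function name is hypothetical.

```python
def pairwise_relations(n_variables: int) -> int:
    """Number of distinct variable pairs, i.e. n(n - 1)/2 correlations."""
    if n_variables < 0:
        raise ValueError("number of variables cannot be negative")
    return n_variables * (n_variables - 1) // 2


# 100 variables give 4,950 correlations; 101 variables give 5,050.
count_100 = pairwise_relations(100)
count_101 = pairwise_relations(101)
```

Because the count grows roughly with the square of the number of variables, no title could list even a small fraction of the relations examined.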

Parenthetically, this note is not at 
all concerned with the relative merits 
of univariate and multivariate experimental
designs. Each has its advantages
and disadvantages, its time
for minor and its time for major roles
in research. The multivariate design 
permits a great number of relations of 
stimuli and responses to be simulta- 
neously examined, but it can examine 
only linear or roughly linear relations 
among them.* However, as far as 
surveys of relationships of particular 
variables are concerned, it is extreme- 
ly unusual, even in univariate stud- 
ies, for the integrator to be con- 
cerned with statistically significant 
departures from linear relationships, 
so there is no argument for omission 
of multivariate experimental findings
on these grounds. 

It may be objected that though the 
variables used in a well-conceived
multivariate experiment cannot be 
crammed into the title, such experi- 
ments nevertheless deal with relative- 
ly few concepts, and these should at 
least enter the title as a guide to the 
searcher on related themes. Occasionally
this can be done, as when a
title on, say, primary abilities, indicates
most of the theoretical interest
attaching to the variables. But in
most researches there are as many diverse
concepts involved as there are
factors, and as many theories as
there are relationships among factors.
There is, indeed, always a systematically
greater number of variables
and factors than are directly
concerned with the main theme of 
the study. For a well designed multi- 
variate research recognizes John Stu- 
art Mill’s proposition in scientific 
method, that the definition of “A”
depends also on knowing the “not-A”
phenomena. Or, to restate
this from the statistical standpoint, 
whereas the univariate experiment is 
concerned to know what part of the 
dependent variable variance is asso- 
ciated with the independent vari- 
able, and is content to regard the rest 
as “error,” the multivariate experiment
secondarily determines also
where the variance not connected 
with the main independent variable is 
tied up. 

Consequently the researcher sophisticated
in multivariate experimental
designs realizes that no matter
what the specialized concept (be
it dependent or independent variable)
on which he is making a survey, it
is vitally necessary to peruse the lists 
of variables and panorama of rela- 
tionships presented in multivariate 
researches. This must be done prac- 
tically without regard to the article 
titles which the experimenter’s pre- 
dilections or the editor's love of brev- 
ity have imposed. The rewards, in 
avoidance of embarrassing omissions 
and in discovery of some of the most 
significant relations, gained from 
such heightened alertness to data, are 
very great. 

If space permitted it would be re- 
grettably easy to instance surveys in 
the past 10 years which have been ab- 
surdly incomplete through lack of re- 
gard for this injunction. For exam- 
ple, there have been surveys on flu- 
ency and speed of reaction which 
have completely omitted the wide 
range of connections found for such 
variables in the factor analytic stud- 
ies of Guilford and his co-workers on 
creativity. There have been studies 
dealing with concepts of natural tem- 
po and speed which omit the many 
associations found in the factor ana- 
lytic study by Rimoldi. There have 
been surveys of evidence on the
meaning of the galvanic skin response,
flicker fusion, rigidity, gestalt
closure, color/form reactions, rate of
conditioning and extinction, anxiety, 
persistence of sensory afterimages, 
reaction to authority, oscillation, suggestibility,
response sets in test performance,
effects of emotional experience
on memory, etc., which have
completely failed to take into account,
for their special purposes,
systems of significant relationships 
found between these variables and 
literally hundreds of others in the 
present writer's studies on objective 
personality tests. And there have 
been surveys of projective principles, 
of experimental work on defense
mechanisms, etc., which have again 
failed to take into account numerous 
findings in the factor analytic studies 
of Eysenck, of the present writer, and 
of various other researchers who have 
experimented on these principles and 
mechanisms in a larger framework of 
multivariate design. 

The irony of this ignorance is that 
one single multivariate study will 
sometimes yield more information on 
the validity or nonvalidity of a set of 
hypothesized connections than will 
all the univariate studies which have, 
up to that point, made up the given 
bibliography. A single experimental 
study using a multivariate design 
with n variables will normally deal
with as many relations for a given
variable as will n different univariate
studies. The surveyor of such studies 
is not bound to accept the factor analytic
solution offered (a fortunate
circumstance in view of the small
proportion of factor analytic studies
reaching acceptable standards of rotational
resolution), for he can simply
go to the correlations in the original
correlation matrix. In the present
writer's experience, one glance at
such a matrix (on a sufficient sample) 
has frequently been sufficient to demolish
some theory on which a student
interested in a particular variable
has been about to start a univariate
research and to suggest some
richer hypothesis with a greater like- 
lihood of survival. 

In summary, the call for more systematic
attention to the correlation
matrices available in multivariate
studies, when integrating knowledge 
on a particular variable or concept, 
is based on these facts: (a) Even one 
such study usually explores a far 
greater range of relationships than in 
two or three dozen univariate studies. 
(b) The form of the data eliminates 
one of the main difficulties which the 
survey integrator experiences, name- 
ly, that of allowing for effects of 
slight differences of sample, testing 
procedure, etc., in combining conclu- 
sions on the various relations from 
many univariate studies. (c) In the 
final inferential reasoning and modi- 
fication of concepts which are the cul- 
mination of such surveys, the integra- 
tor of univariate studies is apt to find 
himself trying to reach a conclusion 
by now partialling out this influence 
and now that, or asking what would 
have happened if this or that had been 
held constant. Usually this is fraught 
with much error and conjecture, but 
in the multivariate study, if he cares 
to proceed to the factor analysis, this 
partialling of influences has already 
been done for him. He is not com- 
pelled to go around in a circle of un- 
certain partiallings out, and even if 
he should disagree with the particu- 
lar factor analytic resolution he is 
free to try a change of rotation. 
Thus, not only does the omission of 
a single multivariate factor analytic 
study from a survey often mean the 
loss of as much sheer relational infor- 
mation as that of a hundred univari- 
ate studies, but it sometimes means 
also the loss of possible definiteness 
of conclusions, which could have re- 
placed the speculative adumbrations 
in which the integrator was finally 
compelled to take refuge in his at- 
tempt to collate many only partially 
equivalent univariate studies. 
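
The "partialling out" described under (c) can be illustrated with the first-order partial correlation, which removes a single third variable z from the relation between x and y. This is a minimal sketch of that standard formula only; it is not the factor analytic resolution the text refers to. The function name and the example coefficients are hypothetical.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("z correlates perfectly with x or y; partial is undefined")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical coefficients: x and y correlate .50, and each correlates .60 with z.
# Holding z constant reduces the x-y relation to about .22.
r_partial = partial_correlation(0.50, 0.60, 0.60)
```

When all three coefficients come from one correlation matrix on one sample, this computation needs no allowance for differences between studies, which is the advantage claimed under (b).
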
Surely one of the main aims of research
surveys is to give dependable
glimpses of emerging order in a way
that will permit formulation of hy- 
potheses, for further research, that 
will have a high probability of sur- 
vival. A sampling of Ph.D. theses in 
university libraries suggests that 
graduate students are all too infre- 
quently taught the wisdom of survey- 
ing multivariate research findings be- 
fore choosing their hypotheses. For, 
indeed, some of the questions they 
have set themselves will be found al- 
ready answered in published corre- 
lation matrices. But many journal 
surveys, by psychologists more mature
at least chronologically than
Ph.D. researchers, unfortunately set
no model of acumen in this respect.
Possibly the situation can be remedied
without reforming our reading
habits simply by some inventor’s introducing
a mode of labeling and indexing
of multivariate studies which
is much more subtle than the present 
ones, using such titles as abilities, 
anxiety, motivation, or temperament,
and which will instead give access to 
the manifold specific experimental 
variables and relations reported upon 
in multivariate studies. But miracles 
are unlikely, and meanwhile our only 
insurance against the systematic loss 
of scientific information here de- 
scribed is for compilers of surveys to 
become more intelligently alert to the 
numerous relationships of psycholog- 
ical and physiological variables com- 
monly hidden in multivariate studies 
and to become more conscientiously 
motivated to seek them out. 


Received November 25, 1957.














